Explanations
terms related to political ideologies and actions
terms related to political or social concepts, particularly those associated with democracy and governance
Negative Logits
MV               -0.65
IF               -0.65
RH               -0.64
STATS            -0.64
CM               -0.62
notwithstanding  -0.61
Hobby            -0.59
rolling          -0.59
showc            -0.58
TW               -0.58
Positive Logits
eness     0.99
ated      0.98
acist     0.96
ates      0.95
ruction   0.93
ration    0.92
acists    0.91
uments    0.91
ciating   0.91
andum     0.91
Activations Density 0.118%
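
For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the fraction of token positions at which the feature's activation is above zero. This is an assumption about what "Activations Density" means here, not a confirmed definition from the source. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumed definition: density = (# active positions) / (# positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 12 of which activate the feature.
sample = [0.0] * 9988 + [0.5] * 12
density = activation_density(sample)
print(f"{density:.3%}")
```

A density near 0.1%, like the one reported above, would under this definition mean the feature fires on roughly one token position in a thousand.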