INDEX
Explanations
names or terms related to political figures or leaders
references to specific individuals, particularly political figures
Negative Logits
catapult  -0.68
OPLE      -0.64
lamm      -0.60
horm      -0.60
mast      -0.59
rake      -0.59
context   -0.59
wcsstore  -0.58
Roundup   -0.58
osterone  -0.58
Positive Logits
vu        0.89
issance   0.78
heed      0.75
ongo      0.73
warm      0.73
zee       0.72
emort     0.71
eur       0.71
inho      0.71
anamo     0.70
Activations Density: 0.219%