INDEX
Explanations
phrases related to political advocacy groups and movements
references to organizations related to political prosperity and discord
Negative Logits
ãĤª          -0.72
phony        -0.70
fries        -0.69
cigarettes   -0.68
EAR          -0.66
Bytes        -0.65
hy           -0.64
WAYS         -0.63
mitting      -0.63
Quote        -0.62
Positive Logits
ments        0.99
iant         0.97
lain         0.91
iator        0.90
iary         0.90
naire        0.90
edly         0.90
ially        0.87
itude        0.87
iation       0.87
Activations Density 0.025%