INDEX
Explanations
names of individuals and specific groups related to politics and cinema
Negative Logits
Lif      -0.59
LOAD     -0.59
itud     -0.56
down     -0.55
downs    -0.55
ð        -0.55
ithub    -0.54
inals    -0.52
ľ        -0.52
heart    -0.51
Positive Logits
ervative          1.00
ervatives         0.78
ROR               0.51
intellectuals     0.48
ilogy             0.48
emporary          0.48
SpaceEngineers    0.47
schild            0.47
omsky             0.47
prototype         0.46
Activations Density: 12.018%