Explanations
references to historical political systems and their characteristics
Negative Logits
税       -0.16
iem      -0.15
cap      -0.14
uko      -0.14
oshi     -0.14
टर       -0.14
.nano    -0.13
oha      -0.13
保护     -0.13
neh      -0.13
Positive Logits
Stalin      0.19
Lenin       0.16
Soviet      0.16
centrally   0.14
erdale      0.14
Pablo       0.14
Reporting   0.13
Len         0.13
bure        0.13
workers     0.13
Activations Density 0.115%