INDEX
Explanations
references to significant social and political events
Negative Logits
Token      Logit
ÅĦ         -0.15
лÑıд       -0.15
iom        -0.15
lash       -0.14
assembly   -0.14
gel        -0.14
Miy        -0.14
lops       -0.13
utter      -0.13
rier       -0.13
Positive Logits
Token      Logit
znam       0.18
_misc      0.14
èŀ         0.13
ãĤ¸        0.13
ICATION    0.13
Ãľn        0.13
WEEN       0.13
oola       0.13
oret       0.13
igham      0.13
Activations Density: 2.424%