Explanations
mentions of social control and manipulation themes
Negative Logits
tens        -0.16
надо        -0.15
_parms      -0.14
:-)         -0.14
oka         -0.14
Ù쨱         -0.14
asyarak     -0.14
аÑĪ         -0.14
OTO         -0.14
lots        -0.13
Positive Logits
ëĺIJíķľ      0.15
Furthermore 0.14
sonst       0.14
null        0.14
industry    0.14
jov         0.13
ª           0.13
OTHERWISE   0.13
-esque      0.13
Furthermore 0.13
Activations Density 2.303%