INDEX
Explanations
counts and mentions of various categories of people, particularly in political and film contexts
Negative Logits
Ñľ          -0.16
ozem        -0.15
tual        -0.15
kuk         -0.14
ยว          -0.14
/***/       -0.14
ernes       -0.14
Ñīик        -0.14
uell        -0.13
&)↵         -0.13
Positive Logits
↵↵          0.18
↵           0.17
stub        0.16
Uns         0.16
ÏĩÏģι       0.16
avou        0.15
/commons    0.15
unft        0.14
staw        0.14
ogo         0.14
Activations Density 0.013%