INDEX
Explanations
references to "neo-" movements and ideologies
Negative Logits
ullo      -0.17
itarian   -0.17
amoto     -0.15
['__      -0.15
Weiter    -0.15
ksi       -0.14
UTE       -0.14
olumn     -0.14
ALLY      -0.13
Cumhur    -0.13
Positive Logits
/ne       0.22
-Nazi     0.19
dle       0.18
đậu       0.16
cons      0.16
Ne        0.16
ček       0.16
(ne       0.16
atal      0.15
Vintage   0.15
Activations Density 0.005%