Explanations
references to political figures and their activities
Negative Logits
Token       Logit
itas        -0.17
opolitan    -0.15
aro         -0.15
ане         -0.15
AGO         -0.14
ile         -0.14
rál         -0.14
tran        -0.14
ÙĪÙħاÙĨ      -0.14
venience    -0.14
Positive Logits
Token       Logit
ifter       0.19
inaire      0.15
Excell      0.14
Chore       0.14
shal        0.14
ore         0.14
inan        0.14
amon        0.14
655         0.14
ina         0.13
Activations Density: 0.033%