Explanations
references to individuals and their professions
Negative Logits
á»iji     -0.14
áte       -0.14
fak       -0.14
ži        -0.13
omor      -0.13
abilit    -0.13
legate    -0.13
oment     -0.13
raham     -0.13
plen      -0.13
Positive Logits
politician    0.18
polit         0.18
rowNum        0.17
politik       0.17
Polit         0.16
kinson        0.16
política      0.16
배우          0.15
political     0.15
actor         0.15
Activations Density 0.036%