INDEX
Explanations
references to leadership and historical figures
Negative Logits
avel       -0.15
ossa       -0.15
otos       -0.15
endra      -0.15
-produ     -0.15
Weiner     -0.15
aleza      -0.14
meisten    -0.14
ília       -0.14
voks       -0.14
Positive Logits
presidente  0.26
secret      0.25
Secret      0.22
due         0.22
President   0.21
socio       0.21
vice        0.21
director    0.20
ger         0.20
mi          0.20
Activations Density 0.032%