INDEX
Explanations
names of political figures
names of individuals, particularly political figures
Negative Logits
Token        Logit
itudinal     -0.68
Hemisphere   -0.68
inally       -0.67
matically    -0.65
boards       -0.64
ultras       -0.63
metic        -0.62
Else         -0.62
Colombian    -0.61
brightest    -0.61
Positive Logits
Token        Logit
ttes         1.00
lette        0.84
otte         0.82
borg         0.79
ecake        0.78
lected       0.77
zza          0.77
qv           0.77
lection      0.74
erest        0.73
Activations Density 0.018%