INDEX
Explanations
names of individuals, particularly surnames
names or terms related to individuals, particularly those involved in politics or notable events
Negative Logits
izable     -0.77
izing      -0.75
ifiable    -0.72
ising      -0.71
eness      -0.71
itudes     -0.70
ified      -0.68
ized       -0.67
izes       -0.67
ured       -0.65
Positive Logits
vernment   0.83
ringe      0.83
izons      0.82
ritz       0.79
zzle       0.78
ellow      0.77
izontal    0.77
rian       0.77
avorite    0.73
rost       0.73
Activations Density: 0.037%