INDEX
Explanations
names of individuals in a specific context
names of individuals associated with notable actions or events
Negative Logits
ardon      -0.91
xon        -0.82
utsche     -0.82
lov        -0.76
osta       -0.75
illance    -0.74
alm        -0.74
iary       -0.72
Naz        -0.71
err        -0.70
Positive Logits
animate    0.83
Shirley    0.76
nces       0.76
nton       0.73
Iro        0.72
Lum        0.72
Sunder     0.72
plumbing   0.72
Hir        0.71
literacy   0.71
Activations Density: 0.028%