INDEX
Explanations
names of individuals, particularly surnames, and familial connections
Negative Logits
indle      -0.15
isphere    -0.15
raç        -0.14
же         -0.13
mut        -0.13
undra      -0.13
leness     -0.13
ilee       -0.13
abaj       -0.13
идет       -0.13
Positive Logits
انو        0.18
family     0.16
orum       0.15
ikut       0.14
ominator   0.14
(es        0.14
WithMany   0.14
deaux      0.13
阅读次数   0.13
siblings   0.13
Activations Density 0.175%