Explanations
references to hereditary traits or abilities
Negative Logits
hton      -0.18
rå        -0.16
éĤ¦       -0.16
arris     -0.16
ostel     -0.16
amaha     -0.15
amas      -0.15
æ¡ĥ       -0.15
妻        -0.14
andler    -0.14
Positive Logits
Her       0.31
her       0.31
her       0.30
Her       0.29
editary   0.28
HER       0.26
HER       0.22
itage     0.22
/her      0.22
etical    0.21
Activations Density: 0.011%