INDEX
Explanations
pronouns indicating bodily parts, particularly head
possessive pronouns indicating ownership or relationships
Negative Logits
Token        Logit
liest        -1.02
imester      -0.84
rentice      -0.84
Izan         -0.82
nown         -0.75
gov          -0.74
ablishment   -0.74
aman         -0.73
equality     -0.70
cy           -0.67
Positive Logits
Token        Logit
sleeves      1.32
fingers      1.28
noses        1.27
nose         1.25
fists        1.22
heels        1.21
toes         1.20
teeth        1.16
claws        1.16
hips         1.16
Activations Density: 0.095%
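
The density figure can be read as the fraction of sampled tokens on which the feature fires. The sketch below illustrates that reading only. It assumes "fires" means an activation strictly above zero, and both the function name `activation_density` and the sample data are hypothetical, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a feature counts as active on a token when its activation
    is strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 19 active tokens out of 20,000 gives the same
# density as the figure reported above.
sample = [0.0] * 19981 + [0.7] * 19
density = activation_density(sample)
print(f"{density:.3%}")
```

At this density the feature is active on roughly one token in a thousand, which fits a feature tied to a narrow vocabulary such as body-part nouns.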