INDEX
Explanations
references to body parts, particularly legs, thighs, feet, hands, and wings
references to body parts and physical features
Negative Logits
claimant          -0.66
eers              -0.64
naire             -0.59
referen           -0.58
insider           -0.58
antidepressant    -0.58
retrospective     -0.57
psychiatry        -0.56
sender            -0.56
atheist           -0.56
Positive Logits
pring             1.28
mith              1.25
uits              1.24
peed              1.23
poons             1.21
pots              1.19
cape              1.18
creen             1.18
pread             1.17
ight              1.12
Activations Density 0.195%