INDEX
Explanations
mentions of physical human body parts
references to physical human anatomy and conditions
Negative Logits
inel       -0.72
Emerson    -0.71
agall      -0.67
ARI        -0.65
Soros      -0.65
vernment   -0.62
AMA        -0.62
ccess      -0.61
USER       -0.60
Hussain    -0.60
Positive Logits
lings      0.96
bags       0.96
meat       0.91
bag        0.87
ciating    0.85
mask       0.84
mares      0.84
shed       0.84
burn       0.82
thirst     0.82
Activations Density: 0.012%