Explanations
references to the word "man" and its variations
Negative Logits
Token       Logit
itſelf      -1.03
ſelves      -1.02
ſelf        -1.01
myſelf      -1.00
pleaſure    -0.97
preſent     -0.92
purpoſe     -0.92
eſſ         -0.90
ſeveral     -0.90
"]);        -0.88
Positive Logits
Token       Logit
man         3.44
Man         2.58
Man         2.30
MAN         2.17
homem       2.07
hombre      2.00
woman       1.97
men         1.96
mans        1.91
man         1.80
Activations Density: 0.053%