Explanations
the word "man" with different levels of activation, focusing on descriptions or actions related to a man
mentions of the word "man."
Negative Logits
Token        Logit
soType       -0.83
ettings      -0.67
emies        -0.66
agall        -0.63
ances        -0.63
Cosponsors   -0.62
iances       -0.62
ancing       -0.61
alysis       -0.61
Lerner       -0.61
Positive Logits
Token        Logit
man          3.40
woman        1.98
Man          1.91
men          1.90
guy          1.80
Man          1.78
mans         1.72
boy          1.72
dude         1.69
man          1.66
Activations Density: 0.042%
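
To give a sense of scale for the density figure, here is a minimal sketch that assumes "activation density" means the percentage of tokens on which the feature's activation is strictly positive. That definition, the helper name `activation_density`, and the toy data are illustrative assumptions, not taken from this page.

```python
def activation_density(activations):
    """Return the percentage of tokens whose activation is strictly positive.

    Assumption: density is defined as (active tokens / all tokens) * 100.
    """
    active = sum(1 for a in activations if a > 0)
    return 100.0 * active / len(activations)


# Toy data (hypothetical): 100,000 tokens, 42 of which activate the feature.
acts = [1.5] * 42 + [0.0] * (100_000 - 42)
density = activation_density(acts)
print(f"{density:.3f}%")
```

Under that assumption, a density of 0.042% corresponds to roughly 42 active tokens per 100,000.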