Explanations
the word "man" in various contexts
references to "man" and related expressions of masculinity
Negative Logits
ADRA            -0.73
LCS             -0.73
PsyNetMessage   -0.71
Democratic      -0.69
RIPT            -0.68
ython           -0.67
Cosponsors      -0.67
IFT             -0.65
Detroit         -0.63
REDACTED        -0.62
Positive Logits
ifest           1.13
hunt            1.07
hood            1.06
fred            1.02
gling           0.99
hattan          0.98
agers           0.95
liness          0.93
oeuv            0.92
uscript         0.92
Activations Density: 0.061%