Explanations
third-person singular pronouns, specifically 'he'
references to male subjects or individuals
Negative Logits
Claire          -0.67
Girls           -0.66
Columb          -0.66
Transactions    -0.64
anking          -0.62
Bunny           -0.62
Bearing         -0.62
Unlimited       -0.61
Engineers       -0.61
Leban           -0.60
Positive Logits
'd              1.22
eded            1.17
'll             1.15
zbollah         1.09
redes           0.91
eding           0.88
encount         0.87
resy            0.87
've             0.85
aps             0.83
Activations Density: 0.382%
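The density figure is commonly read as the share of sampled token positions on which the feature fires. The page does not state how it was computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the zero threshold, and the toy data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature is
    active, with "active" defined as activation > threshold.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: 1000 token positions, 4 of which fire.
toy = [0.0] * 996 + [0.3, 1.2, 0.7, 2.1]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.382% would mean the feature is active on roughly 4 of every 1000 sampled tokens.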