INDEX
Explanations
personal pronouns like "he" or "they"
references to male individuals in a context of actions or descriptions
Negative Logits
fuse          -0.70
brill         -0.69
Pegasus       -0.69
Armageddon    -0.68
Forth         -0.68
Hercules      -0.66
ripple        -0.66
Newark        -0.65
JFK           -0.65
Keynes        -0.64
Positive Logits
âĢ      2.36
âĢ      1.80
âĶ      1.49
âľ      1.47
ãĢ      1.42
â       1.38
âĢł     1.38
âĸł     1.35
ï       1.33
¨       1.32
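The positive-logit tokens above look like mojibake, but they may simply be how a byte-level BPE vocabulary displays raw UTF-8 bytes. The sketch below assumes, without confirmation from this page, that the tokens come from a GPT-2-style byte-level vocabulary, and rebuilds that byte-to-character table to map each token string back to its bytes. The names `byte_level_table` and `token_to_bytes` are hypothetical helpers introduced only for illustration.

```python
def byte_level_table():
    """Rebuild a GPT-2-style table mapping display characters back to bytes.

    Assumption: the dashboard's vocabulary uses the GPT-2 byte-level scheme,
    where printable bytes map to themselves and all other bytes are shifted
    to code points starting at U+0100.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {chr(c): b for b, c in zip(byte_values, code_points)}


CHAR_TO_BYTE = byte_level_table()


def token_to_bytes(token):
    """Convert a displayed token string to the raw bytes it stands for."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


if __name__ == "__main__":
    # Incomplete UTF-8 prefixes are shown with a replacement character.
    for tok in ["âĢ", "âĶ", "âľ", "ãĢ", "âĢł", "âĸł", "ï", "¨"]:
        raw = token_to_bytes(tok)
        print(tok, raw.hex(" "), raw.decode("utf-8", errors="replace"))
```

Under that assumption, most of the listed tokens are incomplete UTF-8 lead sequences (for example, âĢ corresponds to the bytes E2 80, which begin many punctuation characters such as curly quotes and dashes), while âĢł and âĸł are complete three-byte characters.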
Activations Density 0.645%