INDEX
Explanations
personal pronouns referring to a male individual
mentions of a specific individual
Negative Logits
Token         Logit
Pastebin      -0.67
profits       -0.61
Railroad      -0.59
ITS           -0.58
Idle          -0.57
communities   -0.57
Gems          -0.56
AM            -0.56
appl          -0.55
uve           -0.54
Positive Logits
Token         Logit
personally    1.12
panic         0.99
ading         0.92
Majesty       0.90
zbollah       0.87
atically      0.86
enegger       0.85
himself       0.84
eded          0.81
sing          0.81
Activations Density 0.105%