INDEX
Explanations
personal pronouns referencing 'he'
occurrences of the pronoun "he."
Negative Logits
earch        -0.70
anking       -0.69
requisite    -0.67
htaking      -0.67
veyard       -0.66
iries        -0.65
Mandatory    -0.65
iscovery     -0.64
uy           -0.64
regulation   -0.63
Positive Logits
'd           1.24
eded         1.06
'll          1.05
resy         0.96
knew         0.93
zbollah      0.92
thinks       0.88
aped         0.87
ctic         0.87
uristic      0.87
Activations Density: 0.351%