Explanations
references to a specific person indicated by the pronoun "him."
references to a specific individual
Negative Logits
Token      Logit
anking     -0.63
services   -0.63
Result     -0.61
Amanda     -0.61
itures     -0.60
reach      -0.60
give       -0.60
profits    -0.59
Annie      -0.59
mble       -0.59
Positive Logits
Token      Logit
personally 0.94
panic      0.88
Majesty    0.86
ading      0.85
atically   0.83
atic       0.82
soever     0.79
orally     0.77
alian      0.75
zbollah    0.73
Activations Density: 0.096%
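
The density figure can be read as the share of sampled token positions on which the feature fires. The page does not state how it is computed, so the sketch below assumes the common convention of counting activations strictly above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means nonzero-activation frequency; the source
    page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 96 active positions out of 100,000 tokens.
sample = [1.5] * 96 + [0.0] * (100_000 - 96)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a feature this sparse fires on roughly one token in a thousand, which fits a feature tied to a narrow pattern such as a single pronoun.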