INDEX
Explanations
references to specific individuals
mentions of the pronoun "he."
Negative Logits
Token           Logit
anking          -0.66
Transactions    -0.65
Claire          -0.64
requisite       -0.63
earch           -0.63
uy              -0.63
history         -0.63
veyard          -0.62
Bearing         -0.61
Girls           -0.61
Positive Logits
Token           Logit
'd              1.27
eded            1.10
'll             1.08
resy            0.93
knew            0.92
uristic         0.91
aped            0.90
zbollah         0.89
've             0.86
ctic            0.86
Activations Density: 0.344%