INDEX
Explanations
pronouns followed by a verb related to possession or action
the use of first and third-person plural pronouns
Negative Logits
yss        -0.76
Barron     -0.74
aughtered  -0.72
phans      -0.67
retty      -0.66
raints     -0.66
abel       -0.65
rets       -0.63
jri        -0.61
hatt       -0.61
Positive Logits
'd         0.92
'll        0.84
've        0.83
proport    0.81
deems      0.77
ourselves  0.76
RL         0.76
could      0.76
might      0.75
presently  0.73
Activations Density: 0.155%