INDEX
Explanations
personal pronouns followed by a verb
instances of the pronoun "He."
Negative Logits
Token        Logit
vable        -0.65
SPONSORED    -0.59
/$           -0.58
OUR          -0.54
AGES         -0.54
OUS          -0.53
ARCH         -0.52
ADS          -0.52
ARB          -0.51
vous         -0.51
Positive Logits
Token        Logit
He           2.88
His          2.27
He           2.15
Himself      1.81
Him          1.79
His          1.72
She          1.64
he           1.44
he           1.41
HE           1.30
Activations Density: 0.075%