INDEX
Explanations
instances of the pronoun "we"
Negative Logits
Token       Logit
purpoſe     -0.80
fince       -0.75
Monfieur    -0.74
itſelf      -0.74
Thon        -0.71
Houſe       -0.70
Reſ         -0.70
Eſ          -0.68
whoſe       -0.67
Efq         -0.67
Positive Logits
Token       Logit
we          2.26
they        1.50
he          1.28
we          1.27
you         1.26
I           1.22
she         1.18
our         1.13
THEY        1.07
We          1.00
Activations Density 0.124%
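
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes density means the fraction of token positions where the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and do not come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of token positions on which
    the feature fires at all. The tool behind this page may define it
    differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 5 of 4000 token positions.
sample = [0.0] * 3995 + [1.3, 0.2, 2.26, 0.7, 0.05]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would mean the feature is sparse, firing on only a small share of tokens, which fits a feature tied to a narrow class of words such as pronouns.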