INDEX
Explanations
repeated use of the pronoun "they."
Negative Logits
Efq         -0.63
Hins        -0.62
Thon        -0.62
fince       -0.59
purpoſe     -0.59
pomo        -0.57
Bue         -0.56
Nox         -0.55
šķ          -0.55
Monfieur    -0.55
Positive Logits
they        2.91
They        2.49
They        2.47
they        2.34
THEY        2.24
THEY        2.21
mereka      1.70
они         1.63
he          1.63
Mereka      1.60
Activations Density: 0.094%