INDEX
Explanations
the pronoun "who" and its references
Negative Logits
Theſe              -1.34
Efq                -1.26
Jefus              -1.16
^(@)               -1.13
Anſ                -1.11
theſe              -1.10
doubtnut           -1.09
་་                 -1.09
Majefty            -1.08
ConstraintMaker    -1.07
Positive Logits
<eos>              0.82
to                 0.82
who                0.81
                   0.74
if                 0.72
in                 0.72
the                0.71
or                 0.70
for                0.68
I                  0.68
Activations Density: 0.115%