Explanations
questions that begin with "how."
Negative Logits
him        -0.19
herself    -0.17
eux        -0.14
ÏĦαν       -0.14
himself    -0.14
him        -0.14
Them       -0.14
them       -0.14
lui        -0.13
.opend     -0.13
Positive Logits
/if        0.40
they       0.37
we         0.35
best       0.33
else       0.31
much       0.31
soever     0.31
exactly    0.30
you        0.28
itzer      0.28
Activations Density: 0.120%