INDEX
Explanations
occurrences of the word "and" in the text
Negative Logits
RenderAtEndOf   -0.84
Rujuakan        -0.71
zwiſchen        -0.71
pinulongan      -0.70
deſſen          -0.68
<unused41>      -0.68
<unused68>      -0.68
<unused23>      -0.68
<unused74>      -0.68
<pad>           -0.68
Positive Logits
I       0.48
hence   0.48
it      0.47
we      0.47
but     0.46
I       0.46
but     0.44
which   0.44
is      0.43
而      0.42
Activations Density: 0.277%