Explanations
punctuation and specific phrases indicating actions or interactions
Text after commas
words following lists
Negative Logits
useParams    -0.32
</em>        -0.31
↵↵           -0.29
einzu        -0.29
lleg         -0.29
in           -0.28
↵            -0.28
Zus          -0.27
ek           -0.27
"            -0.27
Positive Logits
ſſung        0.84
ſelf         0.82
<unused16>   0.81
[@BOS@]      0.81
<unused8>    0.81
<unused43>   0.81
<unused80>   0.81
<unused41>   0.81
<unused3>    0.81
<unused23>   0.81
Activations Density: 0.389%