Explanations
occurrences of the word "contain" and its variations
Negative Logits
ſſung        -0.80
<unused68>   -0.79
<unused23>   -0.79
<unused52>   -0.79
<unused51>   -0.79
<unused14>   -0.79
[@BOS@]      -0.79
<unused8>    -0.78
<unused3>    -0.78
<pad>        -0.78
Positive Logits
contain      0.68
containing   0.63
contains     0.62
contienen    0.54
contiene     0.53
contain      0.51
written      0.49
zawiera      0.47
bevat        0.47
содержит     0.47
Activations Density 0.172%