INDEX
Explanations
phrases introducing examples or comparisons
Negative Logits
Carriera       -0.62
båda           -0.54
XNUMX          -0.53
AndEndTag      -0.52
UnusedPrivate  -0.52
Damit          -0.50
Damit          -0.50
Jefus          -0.50
ſame           -0.49
ſeveral        -0.49
Positive Logits
those    1.28
those    1.03
:        0.89
ours     0.87
namely   0.86
גון      0.84
Those    0.84
namely   0.81
เช่น     0.81
celles   0.80
Activations Density: 0.418%