INDEX
Explanations
the presence of specific punctuation marks and sentence structure indicators
Negative Logits
Token           Logit
1               -0.66
2               -0.64
7               -0.60
évêque          -0.58
hypotension     -0.56
bbene           -0.53
4               -0.52
is              -0.52
io              -0.52
avanti          -0.52
Positive Logits
Token           Logit
$_"             1.01
.~(\            0.98
"}")            0.96
клопе           0.94
`,              0.89
NUMX            0.89
lenker          0.87
WriteBarrier    0.85
".              0.84
AccessorTable   0.84
Activations Density: 0.127%