INDEX
Explanations
phrases indicating a conclusion or finality
Negative Logits
elow        -0.07
(Op         -0.07
izzo        -0.06
à¹īาà¸ĩ      -0.06
nost        -0.06
_Execute    -0.06
******↵↵    -0.06
scal        -0.06
ossier      -0.06
γκα         -0.06
Positive Logits
over        0.33
Over        0.23
Over        0.23
over        0.23
_over       0.22
OVER        0.22
-over       0.20
sobre       0.19
过          0.18
.over       0.18
Activations Density 0.057%