Explanations
the presence of specific formatting or code-like structures
Negative Logits
-0.50
↵    -0.48
<eos>    -0.47
↵↵    -0.43
Masse    -0.43
fetchone    -0.42
就是    -0.42
handle    -0.41
що    -0.41
彦    -0.41
Positive Logits
enumi    0.98
neſs    0.90
ſelves    0.89
enumii    0.84
itſelf    0.83
Houſe    0.82
Мексичка    0.82
Anſ    0.82
purpoſe    0.82
Portail    0.82
Activations Density 0.017%