Explanations
tokens or characters with high frequency or specific formatting, potentially indicative of programming or code elements
Negative Logits
.stamp    -0.17
errat     -0.17
vetica    -0.16
olini     -0.16
ivent     -0.14
sab       -0.14
ream      -0.13
Sab       -0.13
466       -0.13
steder    -0.13
Positive Logits
point     0.25
Point     0.25
che       0.22
point     0.22
-point    0.21
Point     0.21
position  0.20
pop       0.20
點        0.19
che       0.19
Activations Density 0.007%