INDEX
Explanations
special characters likely representing hidden or encoded information
the end-of-text token and specific unique or special characters
Negative Logits
scattering    -0.76
wink          -0.74
gad           -0.73
gag           -0.69
clad          -0.68
casting       -0.68
whichever     -0.68
scatter       -0.68
planting      -0.68
Peb           -0.67
Positive Logits
º    1.60
¹    1.47
Į    1.41
į    1.40
£    1.39
ª    1.38
®    1.37
»    1.37
¯    1.30
Ń    1.30
Activations Density 0.125%