INDEX
Explanations
punctuation marks, particularly colons, and their usage in code
Negative Logits
unic    -0.17
Towers  -0.15
oblins  -0.14
lical   -0.14
uš      -0.14
lico    -0.14
ahan    -0.13
lassen  -0.13
reas    -0.13
otec    -0.13
Positive Logits
347     0.15
Glob    0.15
hai     0.15
ager    0.14
891     0.14
egie    0.14
Parr    0.14
าธ      0.14
wear    0.14
357     0.14
Activations Density 0.001%