Explanations
punctuation and sentence endings
Negative Logits
Token        Logit
Bair         -0.17
-tm          -0.14
Raw          -0.14
ertos        -0.14
Mil          -0.14
omu          -0.14
ohon         -0.13
å¾³          -0.13
perimeter    -0.13
Brewer       -0.13
Positive Logits
Token        Logit
287          0.16
Baz          0.15
ni           0.14
út           0.14
ysz          0.14
ÙĨÚ¯ÛĮ       0.14
ÑĢÑĥп        0.14
pcodes       0.14
ãĥ¼ãĥ        0.14
leet         0.14
Activations Density 0.007%