INDEX
Explanations
capital letters in an unusual format
specific symbol patterns and certain unique characters
Negative Logits
Token       Logit
ithing      -0.75
Peb         -0.69
ulators     -0.68
bro         -0.67
iations     -0.64
Roads       -0.63
aturdays    -0.63
theless     -0.62
oys         -0.62
MEN         -0.61
Positive Logits
Token       Logit
ª           1.49
©           1.44
Ļ           1.38
Ķ           1.31
¢           1.31
¾           1.29
¡           1.28
¨           1.27
IJ           1.24
Ľ           1.24
Activations Density 0.028%