INDEX
Explanations
Characters and patterns commonly used in programming or regular expressions
NEGATIVE LOGITS
acho      -0.17
aul       -0.16
ëħ        -0.15
æīĢ       -0.15
ample     -0.15
à¸į       -0.14
ãĥ¼ãĥĹ    -0.14
asses     -0.14
Zem       -0.14
浦        -0.13
POSITIVE LOGITS
+         0.20
+↵        0.17
]+        0.16
*         0.15
)+        0.15
+:        0.15
IPA       0.14
)*        0.14
){        0.14
{         0.14
Activations Density 0.036%
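
Several of the top positive-logit tokens (`]+`, `)+`, `)*`, `){`, `*`) have the shape of a regular-expression quantifier that follows a closing bracket or group, which fits the explanation above. The sketch below only illustrates that reading. The patterns, the sample strings, and the helper name `fullmatches` are made up for this example and do not come from the dashboard.

```python
import re

# Hypothetical examples: each pattern contains one of the positive-logit
# tokens, paired with a sample string the pattern is expected to match.
EXAMPLES = {
    "]+": (r"[a-z]+", "abc"),    # one or more characters from a class
    ")+": (r"(ab)+", "abab"),    # one or more repetitions of a group
    ")*": (r"x(ab)*", "x"),      # zero or more repetitions of a group
    "){": (r"(ab){2}", "abab"),  # exactly two repetitions of a group
    "*": (r"a*", "aaa"),         # zero or more of a single character
}


def fullmatches(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    for token, (pattern, text) in EXAMPLES.items():
        # Report whether the token occurs literally in the pattern source
        # and whether the pattern matches its sample string.
        print(token, pattern, token in pattern, fullmatches(pattern, text))
```

The remaining positive tokens (`+:`, `IPA`) do not fit this reading as neatly, so the quantifier interpretation is best treated as a partial description of the feature.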