INDEX
Explanations
special characters and symbols in the text
NEGATIVE LOGITS
Token    Logit
ya       -0.18
blink    -0.16
arrow    -0.15
leh      -0.15
yntax    -0.15
yo       -0.14
plet     -0.14
fax      -0.14
none     -0.13
yal      -0.13
POSITIVE LOGITS
Token    Logit
IJ       0.28
ĺ        0.24
ľ        0.22
Ķ        0.22
ĵ        0.21
ļ        0.20
Ľ        0.20
Ĵ        0.19
Ļ        0.19
ij       0.19
Activations Density: 0.003%