INDEX
Explanations
occurrences of quotes or quotation marks
Negative Logits
“
-0.32
„
-0.29
「
-0.23
``
-0.22
“[
-0.22
(“
-0.21
「あ
-0.20
「
-0.20
「お
-0.19
«
-0.19
Positive Logits
[]"
0.19
()"
0.17
rient
0.16
{'
0.16
gnore
0.15
¦
0.15
!"
0.14
alloca
0.14
."↵↵
0.14
","","
0.14
Activations Density 0.610%
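
As a rough illustration of what a density figure like the one above could mean, the sketch below assumes that activation density is the fraction of sampled tokens on which the feature's activation is greater than zero. That definition, the function name `activation_density`, and the sample data are assumptions made for illustration; they are not taken from this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: a token counts as "active" when the feature's
    activation on it is strictly greater than the threshold.
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 61 of which activate the feature.
sample = [0.0] * 9939 + [1.5] * 61
density = activation_density(sample)
density_label = f"{density:.3%}"
print(density_label)
```

Under this assumption, a density of 0.610% would correspond to the feature firing on about 61 out of every 10,000 tokens.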