INDEX
Explanations
punctuation marks, specifically periods
Negative Logits
it       -0.16
aru      -0.16
ix       -0.15
''       -0.15
ll       -0.15
↵↵↵↵     -0.15
Kraj     -0.15
iy       -0.14
olume    -0.14
'''↵     -0.14
Positive Logits
ORG      0.20
º        0.19
rais     0.17
ystore   0.17
stor     0.16
ustr     0.16
00       0.16
ª        0.16
gon      0.16
jpg      0.15
Activations Density: 0.158%