Explanations
colons and their presence in the text
Negative Logits
ゾ           -0.19
독           -0.15
boro         -0.15
juan         -0.14
sed          -0.14
sed          -0.14
addon        -0.14
ilio         -0.14
acht         -0.14
ilton        -0.14
Positive Logits
ismatic      0.14
enas         0.14
Touch        0.14
elik         0.13
imb          0.13
_anchor      0.13
injunction   0.13
isti         0.13
osp          0.13
ovní         0.13
Activations Density: 0.009%