INDEX
Explanations
sentence-ending punctuation marks and formatting elements
Negative Logits
subs        -0.15
j           -0.14
ih          -0.14
Pron        -0.14
loc         -0.13
ont         -0.13
ypo         -0.13
à¹īาà¸ĩ      -0.13
Spiel       -0.12
Lay         -0.12
Positive Logits
\CMS        0.15
xdb         0.15
å±ĭ         0.14
ضÛĮ        0.14
akis        0.14
Evel        0.14
é£          0.14
gili        0.13
089         0.13
428         0.13
Activations Density 2.568%