Explanations
instances of special characters and punctuation in text
Negative Logits
794       -0.16
iran      -0.16
ÑijÑĢ      -0.15
олод      -0.14
aldo      -0.14
erged     -0.14
Tan       -0.14
arbon     -0.14
iel       -0.14
astered   -0.14
Positive Logits
onda      0.15
rian      0.14
鹿        0.14
urette    0.14
Vers      0.14
disp      0.14
_logits   0.13
bies      0.13
cles      0.13
Hed       0.13
Activations Density: 0.003%
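
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and only illustrate the arithmetic:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires; the dashboard itself does not define it.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 3 of 100,000 token positions.
acts = [0.0] * 100_000
acts[10] = 1.2
acts[500] = 0.4
acts[99_999] = 2.0

density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.003%"
```

Under this reading, a density of 0.003% corresponds to roughly 3 active positions per 100,000 tokens, which marks the feature as very sparse.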