INDEX
Explanations
punctuation marks and formatting in the text
Negative Logits
crossorigin   -0.17
-unstyled     -0.16
kur           -0.16
usern         -0.16
tes           -0.15
inar          -0.15
lech          -0.14
717           -0.14
DUCT          -0.14
$LANG         -0.14
Positive Logits
azo           0.18
Mg            0.17
erialize      0.17
counter       0.17
agara         0.16
Counter       0.16
Assistant     0.15
latter        0.15
He            0.15
Uk            0.14
Activations Density: 0.003%