Explanations
punctuation and formatting elements in the text
Follows punctuation or whitespace
Negative Logits
Token                   Logit
:✨                      -0.82
صوتيه                   -0.77
OGND                    -0.77
[token not visible]     -0.74
Italijani               -0.73
queſta                  -0.73
Meksiku                 -0.72
Administrativna         -0.71
ſſung                   -0.71
Diweddarwch             -0.70
Positive Logits
Token                   Logit
↵↵                      0.54
↵↵↵                     0.44
↵                       0.43
Two                     0.42
An                      0.41
Two                     0.39
An                      0.39
2                       0.39
.                       0.38
I                       0.38
Activations Density 0.007%