Explanations
the presence of special characters or symbols in the text
Negative Logits
greateſt    -1.09
ſelf        -1.04
ſeveral     -1.03
Monfieur    -1.01
Reſ         -1.01
ſche        -1.00
Anſ         -0.99
Jefus       -0.98
Diſ         -0.96
Theſe       -0.96
Positive Logits
&            1.25
[…]          0.60
(            0.59
FormState    0.58
(&           0.58
RunWith      0.57
=&           0.55
&            0.54
nahilalakip  0.54
/            0.53
Activations Density 0.158%