INDEX
Explanations
the negation or sense of prohibition in statements
Negative Logits
.*")]       -0.38
layui       -0.32
twimg       -0.31
a           -0.29
sosial      -0.27
↑           -0.27
Eloquent    -0.27
Mur         -0.27
aler        -0.26
مر          -0.26
Positive Logits
ſind          0.75
ſehen         0.73
ſche          0.71
Weiſe         0.71
<unused41>    0.70
unſer         0.70
<unused79>    0.70
<unused52>    0.70
<unused11>    0.70
<pad>         0.69
Activations Density 0.000%