INDEX
Explanations
statements expressing opinions or emotions
Negative Logits
⇀  -0.39
rhiz  -0.39
盲  -0.38
kuma  -0.38
präche  -0.37
direct  -0.36
Direct  -0.34
LESS  -0.33
gos  -0.33
النه  -0.33
Positive Logits
autorytatywna  0.69
styleType  0.56
grà  0.52
hoeddwyd  0.52
Personendaten  0.49
verwijspagina  0.46
دیکھیے  0.46
相关文章  0.46
onOptions  0.46
ंदीखरीदारी  0.45
Activations Density 0.850%