Negative Logits
-box          -0.08
.↵            -0.08
et            -0.08
interaction   -0.08
box           -0.07
ung           -0.07
A             -0.07
-intensive    -0.07
advantage     -0.07
ناية          -0.07
Positive Logits
nonetheless      0.12
disappointment   0.11
trotzdem         0.11
comunque         0.11
consolation      0.10
сожал            0.10
disappointed     0.10
ご了承           0.10
politely         0.10
gracefully       0.10
Activations Density 0.056%