INDEX
Negative Logits
Token     Value
و         1.10
।         1.09
し        0.93
În        0.91
ueur      0.91
ো         0.88
subjects  0.83
па        0.82
Ин        0.82
it        0.81
Positive Logits
Token     Value
:         1.92
)         1.33
(         1.23
/         1.17
;         1.16
),        1.07
$         1.05
Abuse     1.00
ق         0.97
abusive   0.97
Activations Density: 0.015%