INDEX
Explanations
references to gendered pronouns
Negative Logits
Token         Logit
שוליים        -0.37
arşivlendi    -0.35
MessageOf     -0.33
huawei        -0.32
WebMethod     -0.31
]]);          -0.31
Emulator      -0.31
hnia          -0.30
chery         -0.30
ivelany       -0.30
Positive Logits
Token             Logit
ioutil            0.60
мәкал             0.58
enggak            0.57
Bewußt            0.55
RefNanny          0.54
illustrationer    0.53
Allgeme           0.53
Justiça           0.52
salvar            0.52
mewah             0.50
Activations Density: 0.330%