Explanations
pronouns referring to people or groups
Negative Logits
esternos        -0.64
(©              -0.59
propOrder       -0.58
EDEFAULT        -0.57
Chwiliwch       -0.56
betweenstory    -0.56
باخ             -0.55
profondément    -0.55
createCell      -0.54
Спасылкі        -0.53
Positive Logits
inguém          0.85
Nobody          0.83
gente           0.82
Nobody          0.81
Nadie           0.79
nobody          0.78
owano           0.74
nobody          0.73
Nadie           0.71
אפשר            0.66
Activations Density: 0.193%