Explanations
relative clauses and pronouns referring to people
Negative Logits
desorption    -0.65
Summa         -0.64
Alte          -0.59
lavado        -0.58
alberi        -0.56
spher         -0.56
Appreciate    -0.56
NEO           -0.56
blest         -0.56
olev          -0.55
Positive Logits
who           1.15
[]:           1.02
która         0.84
`;            0.83
примеча       0.82
)";           0.82
osoever       0.82
%             0.81
quien         0.80
والذي         0.80
Activations Density: 0.527%