Explanations
expressions of personal relationships and social dynamics
Negative Logits
esso      -0.75
dets      -0.68
ньому     -0.63
Оно       -0.46
[]*       -0.43
оно       -0.42
Rujuakan  -0.40
ILLES     -0.40
فيه       -0.39
gevens    -0.39
Positive Logits
she       3.59
her       2.80
그녀      2.73
她        2.45
hennes    2.38
彼女は    2.36
její      2.31
彼女の    2.30
เธอ       2.27
she       2.25
Activations Density 1.769%