Explanations
references to personal or familial connections
Negative Logits
esso            -0.73
dets            -0.71
ньому           -0.62
Оно             -0.47
[]*             -0.42
оно             -0.41
byshev          -0.40
gevens          -0.40
ILLES           -0.39
<<<<<<<<<<<<<<  -0.38
Positive Logits
she     3.67
her     2.91
그녀    2.84
她      2.52
hennes  2.50
彼女は  2.50
彼女の  2.42
její    2.42
เธอ     2.36
she     2.31
Activations Density 1.976%