INDEX
Explanations
emotional expressions or sentiments related to achievement and recognition
Negative Logits
яке  -0.79
quels  -0.75
Оно  -0.74
ньому  -0.73
ambao  -0.68
оно  -0.67
которое  -0.67
dets  -0.64
ueltos  -0.64
οποίο  -0.63
Positive Logits
her  4.48
she  4.04
herself  3.43
she  2.93
hers  2.79
彼女の  2.76
그녀  2.75
hennes  2.71
她的  2.67
她  2.67
Activations Density: 2.404%
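
As a rough illustration of how a density figure like this might be computed, here is a minimal sketch. It assumes that density means the fraction of sampled tokens on which the feature's activation is above zero; the source does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all. The dashboard itself does not define it.
    """
    if len(activations) == 0:
        raise ValueError("need at least one activation value")
    # Count tokens where the feature is active (strictly above threshold).
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100 token positions, 3 of which activate the feature.
sample = [0.0] * 97 + [1.2, 0.4, 3.1]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 2.404% would mean the feature is active on roughly 24 of every 1,000 sampled tokens.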