Negative Logits
Token         Logit
लागे           0.74
슭             0.68
zeich         0.66
陟             0.66
ϱ             0.66
Polynomial    0.64
Hyg           0.64
אור           0.63
plication     0.62
纶             0.62
Positive Logits
Token         Logit
anger         3.47
angry         3.19
rage          3.01
enraged       2.79
angrily       2.61
Anger         2.61
fury          2.61
angered       2.56
愤怒           2.52
resentment    2.48
Activations Density: 0.675%