INDEX
NEGATIVE LOGITS
Doing            0.67
doing            0.66
Doing            0.63
doing            0.59
ทำ               0.50
melakukan        0.48
yapılan          0.43
hacerlo          0.43
conocimientos    0.40
conhecimentos    0.40
POSITIVE LOGITS
justice          0.61
differently      0.59
damage           0.55
groundwork       0.55
chores           0.54
thang            0.54
job              0.53
justice          0.53
damage           0.52
짓               0.52
Activations Density: 0.076%