INDEX
Negative Logits
undertook       -1.37
knew            -1.36
wrote           -1.29
はなく          -1.23
已经            -1.22
threw           -1.22
始めた          -1.21
stole           -1.18
confirmación    -1.17
withdrew        -1.16
Positive Logits
taken           1.45
kept            1.27
deceit          1.11
lahko           1.10
capaces         1.09
גע              1.08
eaten           1.06
Terrasse        1.03
subjected       1.03
donnant         1.02
Activations Density 0.355%