Negative Logits
urdy          0.41
போல           0.41
Differences   0.38
отличие       0.38
`<=`          0.38
nontrivial    0.38
successful    0.38
colormap      0.37
`<=           0.36
urity         0.36
Positive Logits
behave        0.92
behaves       0.90
behaved       0.88
behaving      0.80
behaved       0.64
doing         0.63
melakukan     0.57
Doing         0.54
采取          0.54
adopts        0.52
Activations Density 0.018%