NEGATIVE LOGITS
memberikan  -0.09
moderators  -0.08
perusahaan  -0.08
fold        -0.08
expressive  -0.08
)n          -0.08
folding     -0.08
τέρ         -0.08
utherland   -0.08
Arthur      -0.08
POSITIVE LOGITS
Kang        0.08
했습니다    0.08
itha        0.08
igue        0.08
ipherals    0.07
하세요      0.07
银          0.07
నివ         0.07
예방        0.07
accine      0.07
Activations Density 0.007%