INDEX
Negative Logits
Token      Logit
_corpus    -0.08
aşırı      -0.07
anybody    -0.07
'D         -0.07
νονται     -0.07
stating    -0.07
Lastly     -0.07
Không      -0.06
ług        -0.06
ічного     -0.06
Positive Logits
Token      Logit
down       0.07
gratitude  0.07
UP         0.07
Down       0.07
Michael    0.06
명         0.06
Portugal   0.06
PLIER      0.06
deltaY     0.06
McCorm     0.06
Activations Density: 0.031%