INDEX
Negative Logits
birbir         -0.08
ль             -0.07
络             -0.07
ESİ            -0.06
assword        -0.06
亚洲           -0.06
_soup          -0.06
Lakers         -0.06
烈             -0.06
Ply            -0.06
Positive Logits
against        0.07
very           0.07
verification   0.07
(dirname       0.06
_EVENT         0.06
disput         0.06
xương          0.06
THIS           0.06
condol         0.06
(choice        0.06
Activations Density 0.038%