NEGATIVE LOGITS
stairs          -0.07
borrow          -0.07
responses       -0.06
(SK             -0.06
shall           -0.06
underestimate   -0.06
$error          -0.06
==>             -0.06
Ras             -0.06
격              -0.06
POSITIVE LOGITS
violent         0.06
708             0.06
�               0.06
941             0.06
Squared         0.06
67              0.06
εργ             0.06
coin            0.06
haft            0.06
164             0.06
Activations Density 0.000%