Negative Logits
Baldwin        -0.07
أبو            -0.07
jente          -0.06
430            -0.06
250            -0.06
attacking      -0.06
thôn           -0.06
کامپی          -0.06
eaten          -0.06
hlav           -0.06

Positive Logits
https          0.09
https          0.09
ners           0.07
�              0.07
_processing    0.07
://            0.07
biology        0.07
าศ             0.07
By             0.06
_loss          0.06

Activations Density: 0.032%