INDEX
NEGATIVE LOGITS
man        -0.08
partisan   -0.07
maken      -0.07
specified  -0.07
An         -0.06
shape      -0.06
АН         -0.06
debe       -0.06
title      -0.06
rama       -0.06
POSITIVE LOGITS
could      0.10
Could      0.08
could      0.07
CID        0.07
coc        0.07
Took       0.07
отдел      0.07
�          0.07
coping     0.07
ักก        0.07
Activations Density 0.086%