INDEX
NEGATIVE LOGITS
abnormalities  -0.07
_new           -0.06
wrappers       -0.06
Increase       -0.06
Baldwin        -0.06
_Buffer        -0.06
camps          -0.06
settled        -0.06
politics       -0.06
definite       -0.06
POSITIVE LOGITS
�              0.07
zx             0.06
okay           0.06
남             0.06
yaptığ         0.06
VK             0.06
094            0.06
.dto           0.06
freaking       0.06
مك             0.06
Activations Density: 0.014%