INDEX
NEGATIVE LOGITS
handjob
-0.08
èn
-0.07
Crack
-0.07
takeover
-0.06
допом
-0.06
governments
-0.06
warz
-0.06
Born
-0.06
Τα
-0.06
Criterion
-0.06
POSITIVE LOGITS
(effect
0.07
ultip
0.06
есте
0.06
ATRIX
0.06
udev
0.06
('__
0.06
arası
0.06
scanf
0.06
alist
0.06
acquitted
0.06
Activations Density 0.023%