Negative Logits
dividing    -0.07
walls       -0.07
department  -0.07
premature   -0.06
praised     -0.06
sean        -0.06
sts         -0.06
clicking    -0.06
business    -0.06
перен       -0.06

Positive Logits
ói          0.07
.”          0.06
탁          0.06
-lnd        0.06
,全         0.06
roman       0.06
Every       0.06
exter       0.06
(any        0.06
.Roll       0.06

Activations Density 0.001%