INDEX
NEGATIVE LOGITS
hesitation  -0.07
promotions  -0.06
Processor   -0.06
Restaurant  -0.06
Exterior    -0.06
=d          -0.06
Christoph   -0.06
Babies      -0.06
(original   -0.06
06          -0.06
POSITIVE LOGITS
men   0.09
/use  0.08
szer  0.08
men   0.07
Men   0.07
menn  0.07
alom  0.07
MEN   0.07
Men   0.07
ุษ    0.07
Activations Density 0.023%