NEGATIVE LOGITS
Token          Logit
radioButton    -0.07
rea            -0.07
category       -0.07
ratios         -0.07
vulgar         -0.07
jogging        -0.07
Paula          -0.07
anomaly        -0.06
auss           -0.06
Ade            -0.06
POSITIVE LOGITS
Token          Logit
Ship           0.08
ships          0.07
ك              0.07
필             0.07
zip            0.07
Ship           0.07
ship           0.07
ап             0.07
Ships          0.07
ipped          0.07
Activations Density: 0.013%