INDEX
NEGATIVE LOGITS
rgb           -0.08
organizer     -0.08
Organizer     -0.08
्यान          -0.08
Recon         -0.07
reú           -0.07
recon         -0.07
woo           -0.07
سط            -0.07
dedica        -0.07
POSITIVE LOGITS
зл            0.08
सीमा          0.08
违规          0.08
незакон       0.07
violated      0.07
violating     0.07
Boundary      0.07
undesirable   0.07
schicken      0.07
boundary      0.07
Activations Density 0.003%