INDEX
Negative Logits
Token         Logit
September     -0.06
essment       -0.06
assed         -0.06
platinum      -0.06
.collection   -0.06
Breitbart     -0.06
flushing      -0.06
awy           -0.06
Sharia        -0.06
çocuk         -0.06
Positive Logits
Token         Logit
_RULE         0.07
++↵↵          0.06
sne           0.06
_ctrl         0.06
}) ↵ ↵        0.06
tra           0.06
_P            0.06
Requires      0.06
mination      0.06
mathrm        0.06
Activations Density: 0.015%