INDEX
Negative Logits
misogyn         -0.07
unaffected      -0.07
condition       -0.07
ولد             -0.06
taskId          -0.06
Animator        -0.06
=is             -0.06
STREAM          -0.06
WithIdentifier  -0.06
-Clause         -0.06
Positive Logits
enrol           0.07
_ADDR           0.07
DONE            0.07
deprivation     0.06
determination   0.06
WAS             0.06
LIGHT           0.06
Adjust          0.06
gone            0.06
λογ             0.06
Activations Density 0.021%