NEGATIVE LOGITS
Token          Logit
"It            -0.07
\""            -0.07
-it            -0.07
icolor         -0.06
Beer           -0.06
Kit            -0.06
_Set           -0.06
jit            -0.06
lier           -0.06
Plot           -0.06

POSITIVE LOGITS
Token          Logit
Trans           0.15
Trans           0.15
trans           0.14
trans           0.11
.trans          0.11
.Trans          0.10
trans           0.10
-trans          0.10
_trans          0.10
transgender     0.10
Activation Density: 0.021%