Explanations
discussions related to discrimination and civil rights issues
Negative Logits
जर           -0.15
Auction      -0.14
akov         -0.14
uco          -0.14
orz          -0.14
xBC          -0.14
errupted     -0.13
_Offset      -0.13
actories     -0.13
_IMPORTED    -0.13
Positive Logits
protections      0.33
bathroom         0.33
transgender      0.33
discrimination   0.31
Bathroom         0.29
bathrooms        0.29
restroom         0.29
nond             0.28
protection       0.26
Discrim          0.24
Activations Density 0.043%
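
A density figure like the one above is usually read as the fraction of token positions on which the feature is active. The sketch below shows that calculation under this assumption; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires,
    i.e. its activation is strictly greater than `threshold`.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 4 of which are active.
toy = [0.0] * 9996 + [1.2, 0.7, 3.1, 0.4]
density = activation_density(toy)
print(f"{density:.3%}")
```

On this reading, a density of 0.043% corresponds to the feature firing on roughly 4 of every 10,000 tokens.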