Explanations
occurrences of the term "transgender" in discussions of bathrooms, identity, and legislation
Negative Logits
BLIC            -0.87
amina           -0.83
iries           -0.80
Warrant         -0.78
Interstitial    -0.76
Gerr            -0.75
Showtime        -0.74
baugh           -0.74
GOODMAN         -0.74
Office          -0.74
Positive Logits
dysph           1.30
pronouns        1.01
Equality        0.99
identity        0.98
fuck            0.98
equality        0.97
bender          0.97
imbalance       0.96
endered         0.95
stereotypes     0.90
Activations Density: 8.488%