INDEX
Explanations
references to gender identity and related topics like restrooms, facilities, and locker rooms
references to gender identity and related policies
Negative Logits
thunder   -0.73
Byr       -0.69
uncture   -0.67
Exc       -0.67
Winc      -0.66
baugh     -0.66
Prices    -0.65
Money     -0.65
UF        -0.64
Ley       -0.63
Positive Logits
iannopoulos    0.88
pronouns       0.84
disorder       0.79
aki            0.78
identity       0.75
dysph          0.74
transitioned   0.73
Identity       0.72
Disorders      0.71
ference        0.70
Activations Density: 0.091%