Explanations
references to gender-related medical issues and discussions surrounding transgender healthcare
Negative Logits
ichten         -0.15
innitus        -0.14
emoc           -0.14
tron           -0.14
éné            -0.14
domest         -0.13
Interracial    -0.13
Hom            -0.13
åĤ             -0.13
609            -0.13
Positive Logits
trans          0.45
transgender    0.42
Trans          0.40
Gender         0.39
gender         0.39
trans          0.37
Trans          0.36
gender         0.35
Gender         0.34
Transition     0.33
Activations Density 0.108%