INDEX
Explanations
words related to gender identity and political/social controversy
terms related to gender identity and its implications in society
Negative Logits
Token       Logit
Sequ        -0.75
explan      -0.68
obser       -0.65
tremend     -0.64
newcom      -0.62
cknowled    -0.62
millenn     -0.62
reconc      -0.61
Weak        -0.59
repre       -0.59
Positive Logits
Token       Logit
anymore     0.83
.           0.83
.''.        0.77
whatsoever  0.70
or          0.70
.[          0.70
*.          0.68
.,"         0.68
Getty       0.67
ãĢĤ         0.63
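The last positive-logit token, ãĢĤ, looks like a byte-level token string rather than readable text. The sketch below shows one way to decode such strings, assuming the underlying model uses a GPT-2-style byte-level BPE vocabulary (the source does not say which tokenizer is in use). The helper names `bytes_to_unicode` and `decode_token` are illustrative and do not come from the dashboard.

```python
def bytes_to_unicode():
    """Rebuild the byte -> printable-character table used by GPT-2-style
    byte-level BPE (assumed here; the dashboard does not name its tokenizer)."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Bytes without a printable form are remapped to code points from 256 upward.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: vocabulary character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a vocabulary string back to raw bytes, then decode those as UTF-8."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĢĤ"))  # prints 。 (U+3002, ideographic full stop)
```

Under that assumption the token is the CJK full stop, which fits the other positive-logit entries: most of them are sentence-final punctuation.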
Activations Density 0.580%