Explanations
words related to the LGBTQ+ community, especially terms related to homosexuality
references to homosexuality and related topics
Negative Logits
Token       Logit
UGE         -0.77
shroud      -0.71
ãģį         -0.68
hower       -0.67
FUL         -0.66
canopy      -0.66
sender      -0.63
indirect    -0.61
FORE        -0.61
AMS         -0.61
Positive Logits
Token       Logit
estead      1.43
emade       1.36
ework       1.19
osexual     1.12
eless       1.09
ogeneous    1.04
icide       1.04
ogenous     1.03
neys        1.02
ocaust      1.01
Activations Density: 0.009%