INDEX
Explanations
mentions of the LGBT community
references to homosexuality and related topics
Negative Logits
ãģį                      -0.78
shroud                   -0.68
UGE                      -0.65
BuyableInstoreAndOnline  -0.65
Krug                     -0.65
Delta                    -0.62
dent                     -0.62
REE                      -0.62
FUL                      -0.61
EMENT                    -0.61
Positive Logits
emade      1.34
estead     1.34
neys       1.06
ocaust     1.04
onym       1.01
ogeneous   1.01
eless      1.00
osexual    1.00
ework      0.99
icide      0.98
Activations Density 0.003%