Explanations
words related to homosexuality and its discussions
Negative Logits
Diri          -0.81
Skocz         -0.80
carina        -0.79
Taka          -0.77
Johnnie       -0.76
Infórmanos    -0.73
Dort          -0.72
cair          -0.71
Taka          -0.71
ítulo         -0.70

Positive Logits
Hom            1.44
hom            1.38
Hom            1.34
hom            1.25
homogen        1.02
HOM            0.96
HOM            0.96
homosexual     0.88
homog          0.81
homophobic     0.81
Activations Density: 0.012%