INDEX
Explanations
terms related to specific names or words with a strong emphasis or association
terms and phrases related to the LGBTQ+ community
Negative Logits
ngth         -0.80
spoilers     -0.63
âĢ¢âĢ¢       -0.63
76561        -0.63
profession   -0.62
resemb       -0.60
tight        -0.59
ãĥĩ          -0.58
slee         -0.58
deprivation  -0.58
Positive Logits
itely        0.93
bush         0.91
ĵĺ           0.89
ilogy        0.86
ilver        0.78
ocally       0.77
arine        0.75
rences       0.75
coat         0.73
ional        0.73
Activations Density 0.069%
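
A few tokens in the logit lists above (for example âĢ¢âĢ¢, ãĥĩ and ĵĺ) look like the printable stand-ins that GPT-2 style byte-level BPE tokenizers use for raw bytes. The page does not say which tokenizer produced them, so this is an assumption. Under that assumption, a minimal sketch for turning such display strings back into text could look like the following; `decode_display_token` and `CHAR_TO_BYTE` are hypothetical helper names, not part of any library.

```python
def bytes_to_unicode():
    """Byte -> printable character table, as used by GPT-2 style byte-level BPE."""
    # Bytes that already print cleanly are mapped to themselves.
    bs = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    cs = bs[:]
    n = 0
    # Every remaining byte is mapped to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: display character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_display_token(token):
    """Map a display-form token back to text (assumes GPT-2 style byte mapping)."""
    # If any character is outside the table, the assumption does not hold
    # for this token, so return it unchanged.
    if any(ch not in CHAR_TO_BYTE for ch in token):
        return token
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    # Partial multi-byte sequences cannot be decoded on their own;
    # errors="replace" substitutes U+FFFD for them instead of raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["âĢ¢âĢ¢", "ãĥĩ", "ĵĺ", "bush"]:
        print(repr(tok), "->", repr(decode_display_token(tok)))
```

If the assumption holds, the first token should come out as two bullet characters and the second as a single katakana character, while ĵĺ corresponds to UTF-8 continuation bytes only and has no standalone text form. Plain ASCII tokens such as bush pass through unchanged.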