INDEX
Explanations
words related to adult content, specifically the word "porn"
references to pornography and related terms
Negative Logits
IELD                 -0.85
WAYS                 -0.83
▬                    -0.82
姫                   -0.76
Quin                 -0.71
IFE                  -0.70
Brewer               -0.69
COUR                 -0.66
externalActionCode   -0.65
▬▬                   -0.65
Positive Logits
ographers            1.03
ographer             0.95
hub                  0.92
pornography          0.90
ographically         0.89
porn                 0.85
stars                0.84
asus                 0.83
ographic             0.80
ography              0.78
Activations Density: 0.012%
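
To put the density figure in context, here is a minimal sketch of how such a number could be computed. It assumes that "activation density" means the fraction of token positions on which the feature's activation is above zero; the dashboard does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (active positions) / (all positions). The page
    this data came from may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 3 of 25,000 token positions.
sample = [0.0] * 24_997 + [1.7, 0.4, 2.2]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.012%"
```

Under this assumed definition, a density of 0.012% corresponds to roughly 12 active positions per 100,000 tokens, which fits a feature that responds only to a narrow set of terms.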