INDEX
Explanations
references to adult content, specifically focusing on the presence of pornography
mentions of pornography
Negative Logits
姫                    -0.78
soType                -0.68
MacArthur             -0.67
IELD                  -0.67
defe                  -0.66
Brewer                -0.65
Quin                  -0.65
pring                 -0.64
externalActionCode    -0.64
Fol                   -0.63
Positive Logits
ographers             1.26
ographer              1.10
hub                   0.97
ographically          0.97
ography               0.97
ographic              0.92
pornography           0.89
Porn                  0.85
stars                 0.84
OGR                   0.82
Activations Density 0.021%