    Explanations

    references to adult content, specifically focusing on the presence of pornography

    Negative Logits
    "姫"                     -0.78
    "soType"                 -0.68
    " MacArthur"             -0.67
    "IELD"                   -0.67
    " defe"                  -0.66
    " Brewer"                -0.65
    " Quin"                  -0.65
    "pring"                  -0.64
    "externalActionCode"     -0.64
    " Fol"                   -0.63
    Positive Logits
    "ographers"              1.26
    "ographer"               1.10
    "hub"                    0.97
    "ographically"           0.97
    "ography"                0.97
    "ographic"               0.92
    " pornography"           0.89
    " Porn"                  0.85
    "stars"                  0.84
    "OGR"                    0.82
    Activation Density: 0.021%

    No Known Activations