INDEX
    Explanations

    terms related to sexual conduct and harassment

    Negative Logits
    "saraba"             -0.62
    "IsMutable"          -0.62
    "UserScript"         -0.61
    " مرئيه"             -0.60
    " sex"               -0.59
    " Sex"               -0.58
    "ThroughAttribute"   -0.58
    "sex"                -0.57
    " sexual"            -0.57
    "كويكب"              -0.55

    Positive Logits
    " assault"           0.61
    "pyx"                0.58
    " Assault"           0.56
    " minorities"        0.56
    " orientation"       0.53
    " battery"           0.53
    " offenders"         0.52
    "isierte"            0.52
    " Orientation"       0.51
    " Battery"           0.50
    Activation Density: 0.239%

    No Known Activations