INDEX
    Explanations

    phrases related to harassment and intimidation

    Negative Logits
    Token           Logit
    "éĹĺ"           -0.95
    "inet"          -0.78
    "ethe"          -0.77
    "iets"          -0.77
    "stanbul"       -0.76
    " Wonders"      -0.74
    "essential"     -0.73
    "swick"         -0.72
    "ACTED"         -0.72
    "rient"         -0.71
    Positive Logits
    Token           Logit
    " harassment"   0.98
    " accus"        0.96
    " harassing"    0.96
    " harass"       0.95
    " tactics"      0.87
    "assment"       0.86
    " allegations"  0.86
    " stalking"     0.85
    " leveled"      0.84
    " accusations"  0.84
    Activation Density: 0.043%

    No Known Activations