INDEX
    Explanations

    instances of the word "groping" and similar variations relating to unwanted physical contact

    Negative Logits
    Token          Logit
    " tune"        -0.88
    " Tune"        -0.70
    "REE"          -0.69
    "senal"        -0.69
    "SO"           -0.68
    "Effective"    -0.68
    "VID"          -0.68
    "VIS"          -0.67
    "DIT"          -0.66
    " ministry"    -0.65
    Positive Logits
    Token          Logit
    " grop"        1.12
    "ingly"        0.89
    "ing"          0.89
    "atted"        0.89
    "ured"         0.87
    "estation"     0.86
    "ographs"      0.85
    "eties"        0.85
    "raped"        0.85
    "ating"        0.84
    Activation Density: 0.006%

    No Known Activations