INDEX
    Explanations

    situations involving workplace discrimination and harassment allegations

    Negative Logits
    parsedMessage       -0.70
    TestingModule       -0.63
    aarrggbb            -0.62
     desmotivaciones    -0.60
     nakalista          -0.59
     avoient            -0.58
     defaultstate       -0.56
    Filmografie         -0.56
     ModelExpression    -0.54
    lewati              -0.53
    Positive Logits
                        0.41
     Krim               0.40
     span               0.39
     spans              0.39
    ✨:                  0.39
     Kram               0.39
    ̩                   0.38
    span                0.37
    (!)                 0.35
    /******/            0.34
    Act Density 0.899%

    No Known Activations