INDEX
    Explanations

    violent actions or descriptions

    phrases that express negative emotions or actions

    Negative Logits
    "icipated"      -0.82
    " authorised"   -0.72
    "andum"         -0.71
    "psey"          -0.70
    " envis"        -0.69
    " Honour"       -0.68
    "ndum"          -0.68
    " undertaken"   -0.68
    " fulfil"       -0.66
    " commenced"    -0.65
    Positive Logits
    " stuff"        0.79
    " crap"         0.74
    " shit"         0.73
    " weird"        0.71
    "Kids"          0.67
    " Crazy"        0.66
    " Creep"        0.66
    " garbage"      0.66
    " kinda"        0.65
    " dude"         0.64
    Activation Density: 1.821%

    No Known Activations