    Explanations

    profane language

    expressions of frustration or disdain towards things perceived as nonsensical or worthless

    Negative Logits
    Token           Logit
    "hip"           -0.92
    "sole"          -0.75
    "vim"           -0.75
    "significant"   -0.73
    "versible"      -0.73
    "lez"           -0.69
    "ugal"          -0.69
    "expression"    -0.69
    "rez"           -0.69
    "tein"          -0.69
    Positive Logits
    Token           Logit
    " crap"         1.05
    " bullshit"     0.98
    " BS"           0.98
    " blah"         0.94
    " nonsense"     0.93
    " rubbish"      0.92
    " excuse"       0.89
    " excuses"      0.81
    " Jindal"       0.77
    "ocr"           0.70
    Activation Density: 0.006%

    No Known Activations