INDEX
    Explanations

    phrases related to physical violence or aggressive behavior

    mentions of the word "beat."

    Negative Logits
    Token                        Logit
    "asel"                       -0.68
    "ateral"                     -0.67
    " condem"                    -0.66
    "orrow"                      -0.65
    "Import"                     -0.64
    " pard"                      -0.62
    "isse"                       -0.61
    "BuyableInstoreAndOnline"    -0.61
    "OPLE"                       -0.61
    "amera"                      -0.60
    Positive Logits
    Token                        Logit
    "rice"                       1.14
    "beat"                       1.08
    "down"                       1.07
    "boxing"                     0.99
    "downs"                      0.97
    "nik"                        0.94
    "tle"                        0.93
    "ework"                      0.91
    "ings"                       0.89
    "hered"                      0.87
    Act Density 0.045%

    No Known Activations