INDEX
    Explanations

    instances of actions involving cutting off body parts

    Negative Logits
    -0.71
    hai
    -0.66
    Gall
    -0.65
    MX
    -0.62
    Football
    -0.62
    zilla
    -0.61
    anners
    -0.61
    ĪĴ
    -0.58
    inx
    -0.58
    elsius
    -0.58
    Positive Logits
     Broadway
    0.73
    screen
    0.72
    ment
    0.71
    river
    0.68
    shoot
    0.66
    site
    0.66
     leash
    0.66
     dun
    0.65
    cffffcc
    0.65
     shore
    0.64
    Activation Density: 0.052%

    No Known Activations