    Explanations

    references to violence and its various contexts

    Negative Logits
    Token       Logit
    "leanup"    -0.16
    "CAST"      -0.16
    "alle"      -0.16
    "roud"      -0.15
    "/stdc"     -0.15
    "owitz"     -0.15
    "elian"     -0.15
    "Ñıг"       -0.14
    "689"       -0.14
    "chg"       -0.14
    Positive Logits
    Token       Logit
    "vens"      0.20
    "-force"    0.18
    " force"    0.17
    "essel"     0.16
    "/angular"  0.15
    "force"     0.15
    "adier"     0.15
    " Force"    0.14
    "339"       0.14
    "olson"     0.14
    Activation Density: 0.019%

    No Known Activations