    Explanations

    phrases related to general situations and actions

    themes related to the concepts of violence and societal issues

    Negative Logits
    " Particularly"    -0.78
    "particularly"     -0.76
    " sidx"            -0.75
    " Specifically"    -0.75
    " Especially"      -0.71
    "arin"             -0.71
    "significant"      -0.70
    "ourage"           -0.68
    "ierre"            -0.67
    "especially"       -0.66
    Positive Logits
    " unaffected"      1.09
    " unchanged"       1.08
    " irrelevant"      1.06
    " harmless"        1.03
    " ignored"         1.02
    " impunity"        0.99
    " alright"         0.99
    " unrem"           0.97
    " shrugged"        0.96
    " indifferent"     0.96
    Activation Density: 0.576%

    No Known Activations