INDEX
    Explanations

    themes and vocabulary related to violence and brutality

    Negative Logits
    Token         Logit
    "resent"      -0.14
    " underst"    -0.14
    " Monte"      -0.14
    "erea"        -0.13
    "bout"        -0.13
    "gree"        -0.13
    "setup"       -0.13
    " Hyp"        -0.13
    "ổng"         -0.13
    "refund"      -0.13
    Positive Logits
    Token         Logit
    "ccione"      0.16
    "auer"        0.15
    "ifax"        0.15
    "/rss"        0.15
    "ọt"          0.14
    "قط"          0.14
    "PILE"        0.14
    "огод"        0.14
    "å¯"          0.14
    "arkan"       0.14
    Activation Density: 0.376%

    No Known Activations