INDEX
    Explanations

    references to violence and brutality, particularly in relation to historical events

    Negative Logits
    " Mint"     -0.16
    "378"       -0.15
    "ços"       -0.14
    "åĽº"       -0.13
    "çĹ"        -0.13
    "ÙĨØ´"      -0.13
    "enco"      -0.13
    "ovel"      -0.13
    "379"       -0.13
    " ì¢"       -0.13
    Positive Logits
    " dec"       0.31
    " severed"   0.29
    " cuts"      0.28
    "åĪĩ"        0.27
    " dissect"   0.25
    " cut"       0.25
    " ÙĤطع"     0.25
    " hacked"    0.24
    " mutil"     0.24
    " sever"     0.24
    Activation Density: 0.228%

    No Known Activations