INDEX
    Explanations

    references to murder and related violent crimes

    Negative Logits
    Token       Logit
    "laş"       -0.15
    "lage"      -0.14
    "州"        -0.14
    "izu"       -0.14
    "itzer"     -0.14
    "ERA"       -0.14
    "suming"    -0.14
    "儿"        -0.14
    "uplic"     -0.14
    "stanov"    -0.13
    Positive Logits
    Token       Logit
    "-death"    0.16
    "-su"       0.15
    "anova"     0.15
    "ously"     0.15
    "ous"       0.15
    " spree"    0.14
    " Scenes"   0.14
    "stk"       0.14
    "auer"      0.14
    "greg"      0.14
    Activation Density: 0.021%

    No Known Activations