    Explanations

    phrases related to causing harm or destruction

    references to acts of killing

    Negative Logits
    Token                        Logit
    " Collider"                  -0.76
    "rial"                       -0.73
    "BuyableInstoreAndOnline"    -0.73
    "umn"                        -0.67
    "Grad"                       -0.66
    " concess"                   -0.65
    "arity"                      -0.65
    "anwhile"                    -0.64
    "provided"                   -0.62
    " Celest"                    -0.61

    Positive Logits
    Token         Logit
    " spree"      0.89
    "icides"      0.88
    "houses"      0.82
    "killer"      0.81
    " kill"       0.79
    " killing"    0.79
    "joy"         0.78
    "switch"      0.78
    "icide"       0.76
    " killers"    0.75

    Activation Density: 0.025%

    No Known Activations