INDEX
    Explanations

    phrases related to violence and the act of killing

    Negative Logits
    esta          -0.17
    esto          -0.15
    stanov        -0.14
    å±            -0.14
    ãģİ           -0.14
    acles         -0.14
    ndern         -0.14
    itsu          -0.14
    906           -0.13
    neider        -0.13
    Positive Logits
    kö            0.15
    .ToShort      0.14
    throp         0.14
    patrick       0.14
     instincts    0.14
    abyrin        0.14
    δÏģο          0.14
    ábado         0.14
    /goto         0.14
     kep          0.14
    Activation Density: 0.034%

    No Known Activations