INDEX
    Explanations

    phrases related to killing and violence

    Negative Logits
    esto     -0.16
    å±       -0.16
    igo      -0.15
    orial    -0.15
    ted      -0.14
    906      -0.14
    acles    -0.14
    ential   -0.14
    838      -0.14
    olina    -0.14
    Positive Logits
    ábado       0.19
    throp       0.15
    abyrin      0.14
    icie        0.14
     Dim        0.13
    iciel       0.13
    ifestyles   0.13
     Jennings   0.13
    ourg        0.13
    plotlib     0.13
    Act Density 0.040%

    No Known Activations