INDEX
    Explanations

    references to stabbings and knives

    Negative Logits
    Token        Logit
    "ostante"    -0.72
    "ưng"        -0.69
    " Pele"      -0.66
    " grand"     -0.66
    " tot"       -0.65
    " Bial"      -0.64
    "ond"        -0.63
    " thảo"      -0.63
    " drey"      -0.63
    "grand"      -0.63

    Positive Logits
    Token        Logit
    " knife"     1.90
    " knives"    1.84
    " Knife"     1.81
    " Knives"    1.79
    "knife"      1.74
    "Knife"      1.69
    "Kni"        1.39
    " coute"     1.25
    " Kni"       1.23
    "kni"        1.16

    Activation Density: 0.006%

    No Known Activations