INDEX
    Explanations

    mentions of violent attacks and their consequences

    Negative Logits
    lean            -0.17
    onium           -0.15
    agini           -0.15
    zim             -0.14
    agine           -0.14
    umi             -0.14
    Īĺ              -0.14
    rana            -0.14
    mins            -0.14
     compliment     -0.14
    Positive Logits
    odyn            0.15
    iya             0.15
     Yard           0.15
    etÃŃ            0.14
    eks             0.14
    aleb            0.14
     Kick           0.13
     khúc           0.13
    дел             0.13
     latest         0.13
    Act Density 0.047%

    No Known Activations