INDEX
    Explanations

    references to conflict and violence, particularly related to terrorism and illegal settlements

    Negative Logits
     Mercy
    -0.16
     adm
    -0.15
    Anchor
    -0.15
    aram
    -0.15
    ovní
    -0.15
    oples
    -0.14
     kostenlose
    -0.14
     Kemal
    -0.13
    legg
    -0.13
     Dog
    -0.13
    Positive Logits
    endra
    0.20
    ż
    0.15
    estroy
    0.14
     Lub
    0.14
     Chore
    0.14
    /ng
    0.13
    바이
    0.13
    adena
    0.13
    riot
    0.13
    寸
    0.13
    Act Density 0.006%

    No Known Activations