    Explanations

    terms related to civilian casualties and the implications of warfare

    NEGATIVE LOGITS
    Token             Logit
    ".Expressions"    -0.16
    "Å¡tÄĽ"           -0.16
    "ibil"            -0.15
    " Riot"           -0.15
    " Bullet"         -0.15
    "ιÏĩ"             -0.15
    "å¼ĥ"             -0.14
    " mand"           -0.14
    "Href"            -0.14
    "à¤ľà¤°"          -0.14
    POSITIVE LOGITS
    Token             Logit
    " strikes"        0.44
    " strike"         0.39
    " Strikes"        0.36
    " Strike"         0.33
    " airstrikes"     0.31
    " airst"          0.31
    "Strike"          0.29
    " bombing"        0.29
    "strike"          0.29
    "_strike"         0.28
    Act Density 0.066%

    No Known Activations