    Explanations

    mentions of Israel and its military actions

    Negative Logits
    Token       Logit
    "bury"      -0.17
    "antu"      -0.16
    "ogle"      -0.16
    "utsche"    -0.15
    ".TR"       -0.15
    "ìĦľëĬĶ"    -0.15
    "chten"     -0.15
    "باÙĦ"      -0.15
    "ansom"     -0.15
    ".jet"      -0.14
    Positive Logits
    Token       Logit
    "608"       0.17
    "å¡"        0.16
    "hone"      0.15
    "AVED"      0.14
    "ancel"     0.14
    "itet"      0.14
    "/lang"     0.13
    "LD"        0.13
    " Shank"    0.13
    " Solic"    0.13
    Activation Density: 0.012%

    No Known Activations