    Explanations

    expressions and references related to political conflicts and societal issues

    Negative Logits
    Token        Logit
    "シー"       -0.07
    "čel"        -0.07
    "ippers"     -0.07
    "ambique"    -0.07
    "anza"       -0.07
    " tslint"    -0.06
    "ume"        -0.06
    " Kit"       -0.06
    " Kits"      -0.06
    "ITO"        -0.06
    Positive Logits
    Token            Logit
    " self"          0.13
    "Self"           0.11
    " Self"          0.11
    " sab"           0.10
    "self"           0.10
    "-self"          0.09
    " sabot"         0.09
    ".self"          0.09
    " Sab"           0.09
    " destruction"   0.09
    Activation Density: 0.057%

    No Known Activations