INDEX
    Explanations

    references to anti-government or anti-establishment sentiments

    Negative Logits
    " stead"               -0.74
    "è¯"                   -0.71
    "hips"                 -0.70
    "ourced"               -0.69
    "bos"                  -0.66
    " therap"              -0.65
    "theless"              -0.64
    "anyl"                 -0.64
    "antically"            -0.63
    " externalToEVAOnly"   -0.63
    Positive Logits
    "Fa"                    0.90
    "strate"                0.80
    "zac"                   0.77
    "war"                   0.77
    "hero"                  0.75
    "hesis"                 0.73
    "Monitor"               0.72
    " Dhabi"                0.71
    "Age"                   0.69
    "abuse"                 0.69
    Activation Density: 0.006%

    No Known Activations