INDEX
    Explanations

    mentions of negative political situations or criticisms

    Negative Logits
    Token            Logit
    "secut"          -0.74
    "CVE"            -0.70
    " simultane"     -0.69
    "ocent"          -0.68
    " attentive"     -0.66
    " uncond"        -0.63
    "SPONSORED"      -0.63
    " amen"          -0.60
    "etheless"       -0.60
    " relativity"    -0.60
    Positive Logits
    Token            Logit
    "yard"           1.06
    "yards"          0.94
    "shaw"           0.93
    "hire"           0.89
    "books"          0.88
    "book"           0.87
    "aper"           0.85
    "enhagen"        0.85
    "herer"          0.83
    "TING"           0.83
    Activation Density: 0.022%

    No Known Activations