    Explanations

    references to political figures and their statements or actions

    Negative Logits
    Token           Logit
    "affer"         -0.17
    "agen"          -0.17
    ".localized"    -0.16
    "Anchor"        -0.15
    "ulg"           -0.15
    " analyzes"     -0.14
    "mention"       -0.14
    " gaps"         -0.14
    " Suit"         -0.14
    "ocol"          -0.14
    Positive Logits
    Token           Logit
    " yesterday"    0.23
    " flag"         0.22
    " today"        0.21
    " rub"          0.20
    " reveal"       0.19
    " tonight"      0.18
    " revealing"    0.18
    " welcoming"    0.18
    " robust"       0.18
    "Yesterday"     0.17
    Activation Density: 0.149%

    No Known Activations