INDEX
    Explanations

    phrases related to public figures and their actions, particularly in a political context

    phrases related to claims and statements made by individuals, particularly in a political context

    Negative Logits
    Token         Logit
    "agar"        -0.79
    "td"          -0.70
    "allery"      -0.69
    "udic"        -0.68
    "ason"        -0.67
    "arist"       -0.66
    "neau"        -0.66
    "ept"         -0.66
    "central"     -0.65
    " Brill"      -0.65

    Positive Logits
    Token         Logit
    " himself"    0.85
    " surrog"     0.84
    " lewd"       0.80
    " retweet"    0.77
    " tweeting"   0.75
    " Tonight"    0.74
    "gyn"         0.73
    " onstage"    0.73
    " sarcast"    0.72
    " insults"    0.71
    Activation Density: 0.478%

    No Known Activations