INDEX
    Explanations

    mentions of the White House

    Negative Logits
    Token          Logit
    "ITAL"         -0.83
    "odcast"       -0.77
    "raints"       -0.72
    "orsi"         -0.69
    "SIGN"         -0.67
    "Occup"        -0.67
    "ENDED"        -0.66
    "tics"         -0.66
    "olls"         -0.65
    " WATCHED"     -0.65
    Positive Logits
    Token             Logit
    "caps"            1.07
    " White"          1.04
    "berry"           0.96
    "house"           0.93
    " supremacist"    0.92
    " supremacists"   0.91
    "hall"            0.91
    " suprem"         0.90
    "zee"             0.83
    " Sox"            0.82
    Activation Density: 0.014%

    No Known Activations