INDEX
    Explanations

    mentions of the White House

    Negative Logits
    Token            Logit
    "ITAL"           -0.85
    "odcast"         -0.75
    "raints"         -0.74
    "orsi"           -0.73
    "tics"           -0.71
    "trak"           -0.70
    " WATCHED"       -0.69
    "ModLoader"      -0.68
    "olls"           -0.66
    "rg"             -0.65
    Positive Logits
    Token            Logit
    "berry"          0.95
    "caps"           0.95
    "house"          0.90
    "hall"           0.88
    " White"         0.88
    " suprem"        0.83
    "zee"            0.82
    " supremacist"   0.81
    " Sox"           0.80
    "horse"          0.79
    Activation Density: 0.015%

    No Known Activations