INDEX
    Explanations

    mentions of the word "Trump."

    repeated mentions of the word "trumps" and other related variations

    Negative Logits
    Token           Logit
    "raq"           -0.84
    "ANC"           -0.84
    "GAN"           -0.79
    "ANCE"          -0.77
    "zai"           -0.76
    " srfAttach"    -0.75
    "RW"            -0.72
    "ANA"           -0.71
    "BIL"           -0.70
    "WORK"          -0.70
    Positive Logits
    Token           Logit
    "hift"          1.00
    "manship"       1.00
    "paces"         0.95
    "pace"          0.87
    "ilver"         0.85
    "peed"          0.84
    "poons"         0.82
    "oulos"         0.82
    "uits"          0.79
    "hops"          0.79
    Activation Density: 0.028%

    No Known Activations