INDEX
    Explanations

    mentions of prominent political figures, particularly Donald Trump and George W. Bush

    Negative Logits
    Token       Logit
    jad         -0.16
    .Promise    -0.16
    uer         -0.15
    ron         -0.15
    xis         -0.14
    jos         -0.14
    istrib      -0.14
    atis        -0.14
    jav         -0.14
    onth        -0.14
    Positive Logits
    Token       Logit
     Tome       0.16
    mpar        0.15
     Buckley    0.14
    ç¿Ķ         0.14
    psilon      0.14
     Exercise   0.14
    eki         0.13
    áÄį         0.13
    cles        0.13
    antry       0.13
    Act Density 0.116%

    No Known Activations