INDEX
    Explanations

    mentions of political figures' last names

    names of political figures and their affiliations

    Negative Logits
    "ossier"         -0.78
    "displayText"    -0.77
    " myster"        -0.76
    " endorsements"  -0.72
    " endors"        -0.68
    "accompan"       -0.66
    " proble"        -0.65
    "vana"           -0.64
    "ilater"         -0.64
    " assistants"    -0.64
    Positive Logits
    "bre"            0.82
    "cone"           0.72
    "rame"           0.70
    "fruit"          0.69
    "Nar"            0.68
    "hend"           0.66
    "hurst"          0.65
    "pload"          0.65
    "hoff"           0.63
    " Robot"         0.63
    Activation Density: 0.359%

    No Known Activations