INDEX
    Explanations

    mentions of politicians

    references to politicians and political figures

    Negative Logits
    urious        -0.74
    actory        -0.73
    ventory       -0.72
    uran          -0.68
    gged          -0.67
    wered         -0.65
     Cancel       -0.64
    DEP           -0.64
    uras          -0.63
     Condition    -0.63

    Positive Logits
    clinton       1.08
     appoint      0.82
    hips          0.82
     correctness  0.76
    icians        0.74
     impe         0.68
    woman         0.68
     junk         0.67
     elected      0.67
    hip           0.67
    Act Density 0.030%

    No Known Activations