INDEX
    Explanations

    proper names or words related to people, likely politicians or public figures

    references to individuals and their relationships or actions

    Negative Logits
    */(               -0.82
     Agents           -0.74
    hold              -0.74
    soDeliveryDate    -0.72
     Cosponsors       -0.72
    IAL               -0.72
     message          -0.69
    rab               -0.67
    rants             -0.67
    boards            -0.65
    Positive Logits
    oli               1.41
    ague              1.07
    olini             0.98
    veyard            0.93
    zzo               0.92
    omon              0.90
    oglu              0.89
    ppo               0.89
    uca               0.88
    ola               0.87
    Act Density 0.005%

    No Known Activations