    Explanations

    proper nouns, specifically names related to politics and leadership

    mentions of specific individuals or characters

    Negative Logits
     Whitman     -0.70
    Fed          -0.68
     Sed         -0.68
     Plum        -0.65
     Bullets     -0.65
    Reviewer     -0.64
    Spread       -0.64
    REDACTED     -0.63
     Harriet     -0.63
     Sergeant    -0.63
    Positive Logits
    emer         0.79
    ymes         0.77
    ippers       0.76
    \\\\         0.76
    agan         0.75
    oğ           0.74
    ark          0.71
    oos          0.70
    pid          0.70
    ille         0.69
    Activation Density: 0.032%

    No Known Activations