INDEX
    Explanations

    political figures and government-related terms

    proper nouns, particularly names of people and organizations

    Negative Logits
    $.      -0.62
    ".      -0.58
    }.      -0.58
     ".     -0.57
    ''.     -0.53
    ").     -0.51
    .).     -0.50
    .</     -0.49
    ().     -0.49
     ).     -0.48
    Positive Logits
     spokesman       0.69
     spokeswoman     0.67
     spokesperson    0.60
     countered       0.57
     meanwhile       0.57
     reacted         0.56
     tweeted         0.52
     echoed          0.52
     commented       0.51
     cautioned       0.51
    Act Density 1.001%

    No Known Activations