INDEX
    Explanations

    mentions of political figures, especially negative references and critiques

    occurrences of the word "and."

    Negative Logits
    Token         Logit
    "inarily"     -0.76
    "ridges"      -0.68
    "note"        -0.67
    "Contents"    -0.65
    "culus"       -0.65
    "floor"       -0.65
    "physical"    -0.64
    "different"   -0.64
    "IED"         -0.64
    "ENS"         -0.63
    Positive Logits
    Token             Logit
    " Associates"     0.90
    " vice"           0.83
    " others"         0.77
    "ERSON"           0.76
    " Sons"           0.74
    " associates"     0.73
    " assorted"       0.71
    " consequently"   0.69
    " other"          0.69
    " then"           0.68
    Activation Density: 0.336%

    No Known Activations