INDEX
    Explanations

    names of political figures or entities

    references to prominent political figures and their statements

    Negative Logits
    "!."            -0.74
    ".�"            -0.70
    "}."            -0.69
    ".''"           -0.67
    ".$"            -0.63
    ".ãĢį"          -0.62
    "''."           -0.61
    ".--"           -0.60
    " sqor"         -0.58
    "utterstock"    -0.58
    Positive Logits
    " hadn"             0.89
    " should"           0.85
    " had"              0.83
    " shouldn"          0.78
    " lacked"           0.77
    " discriminated"    0.74
    " lacks"            0.74
    " could"            0.72
    " violated"         0.71
    " behaved"          0.69
    Activation Density: 0.893%

    No Known Activations