    Explanations

    phrases related to political criticism and opposition

    references to political controversy and public outrage

    Negative Logits
    Token             Logit
    " prepar"         -0.71
    "eday"            -0.71
    " [|"             -0.70
    " depended"       -0.68
    " ~/"             -0.68
    " MAP"            -0.67
    " Intermediate"   -0.66
    " 000"            -0.64
    " [("             -0.64
    "8000"            -0.64
    Positive Logits
    Token             Logit
    " misogyny"       1.51
    " sexism"         1.47
    " sexist"         1.43
    " misogyn"        1.39
    " homophobia"     1.36
    " homophobic"     1.33
    " scandals"       1.27
    " bigotry"        1.27
    " hypocrisy"      1.24
    " racism"         1.22
    Activation Density: 1.144%

    No Known Activations