INDEX
    Explanations

    mentions of demographics and societal issues like race, gender, inequality, and politics

    Negative Logits
    " externalToEVAOnly"   -0.68
    " centrif"             -0.67
    "Downloadha"           -0.61
    " ridic"               -0.60
    "parser"               -0.60
    " sidebar"             -0.60
    " ingred"              -0.59
    " freeing"             -0.59
    " sshd"                -0.57
    " FTA"                 -0.57
    Positive Logits
    "course"        1.03
    "icial"         1.00
    " stature"      0.99
    " course"       0.89
    "ortunately"    0.88
    " origin"       0.84
    "oubted"        0.81
    " whom"         0.81
    "interest"      0.79
    " prominence"   0.79
    Act Density 0.131%

    No Known Activations