INDEX
    Explanations

    words related to challenging or dispelling stereotypes and myths

    references to stereotypes and discussions about their implications

    Negative Logits
    Token             Logit
    "ea"              -0.79
    " avail"          -0.74
    " Liquid"         -0.71
    " liquid"         -0.68
    "ener"            -0.66
    " Aur"            -0.65
    " pending"        -0.65
    "live"            -0.64
    "pload"           -0.64
    " authorized"     -0.62
    Positive Logits
    Token                Logit
    " stereotypes"       3.57
    " stereotype"        3.47
    " stereotyp"         2.67
    " stereotypical"     2.66
    " clich"             2.09
    " caricature"        1.94
    " tropes"            1.93
    " misconceptions"    1.78
    " prejudices"        1.71
    " cliché"            1.71
    Activation Density: 0.026%

    No Known Activations