INDEX
    Explanations

    stereotypes and related terms

    references to stereotypes

    Negative Logits
    ayan            -0.79
    inth            -0.74
    sterdam         -0.73
    rique           -0.71
    ateur           -0.68
    gan             -0.68
    ighters         -0.67
    ighth           -0.67
    light           -0.66
    packing         -0.65
    Positive Logits
     stereotyp      1.01
     stereotype     0.90
     stereotypes    0.89
    rities          0.81
     portrayal      0.80
     depictions     0.80
     portray        0.80
    覚醒            0.80
     caricature     0.73
     clich          0.72
    Act Density 0.018%

    No Known Activations