INDEX
    Explanations

    topics related to stereotypes and generalizations, particularly about gender and race

    Negative Logits
     BorderSide          -0.35
     edile               -0.35
    ably                 -0.34
    lotl                 -0.34
    SaveChangesAsync     -0.34
    ModelSerializer      -0.34
    animous              -0.33
     blotting            -0.33
    wiście               -0.33
     amicable            -0.33
    Positive Logits
     stereotypes         0.71
     stereotype          0.68
     myſelf              0.65
     Monfieur            0.58
     pigeon              0.58
     stereotyp           0.56
     Anſ                 0.56
     prejudices          0.56
    ſelves               0.52
    pigeon               0.52
    Act Density 0.086%

    No Known Activations