INDEX
    Explanations

    instances of words related to colors

    references to racial and cultural identifiers or stereotypes

    NEGATIVE LOGITS
    igious      -0.72
    FUN         -0.67
    Services    -0.66
    izarre      -0.66
    atis        -0.65
    Internal    -0.64
    Effective   -0.64
    Specific    -0.64
    bably       -0.64
    ãĤ¡         -0.63
    POSITIVE LOGITS
     stripes    0.92
     striped    0.80
     stripe     0.79
    oxide       0.77
     flakes     0.76
     Metallic   0.74
     syndrome   0.73
    cloth       0.73
     stain      0.71
    berries     0.70
    Activation Density: 0.303%

    No Known Activations