INDEX
    Explanations

    references to race and racial issues.

    Negative Logits
    Token           Logit
    " racing"       -0.16
    " Racing"       -0.16
    " races"        -0.16
    ".metamodel"    -0.16
    "iram"          -0.15
    "IRA"           -0.14
    " Baz"          -0.14
    "ares"          -0.14
    "inition"       -0.14
    "eling"         -0.14
    Positive Logits
    Token           Logit
    " profiling"    0.26
    "/color"        0.21
    "-prof"         0.21
    " cleansing"    0.21
    " Cleans"       0.19
    "ized"          0.19
    " profiler"     0.18
    " pride"        0.18
    " harmony"      0.18
    " epith"        0.18
    Act Density 0.033%

    No Known Activations