INDEX
    Explanations

    the word “black” when referring to race or African‐American identity.

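    Negative Logits
    (tokens are shown in single quotes so that leading spaces are visible)
    Token             Value
    'Defs'            -0.07
    '.Expressions'    -0.07
    ' Nan'            -0.07
    ' guitarist'      -0.07
    (not shown)       -0.07
    ' även'           -0.07
    ' pregunta'       -0.07
    ' Prophet'        -0.06
    ' Cash'           -0.06
    'ush'             -0.06

    Positive Logits
    (tokens are shown in single quotes so that leading spaces are visible)
    Token             Value
    ' black'           0.08
    ' shake'           0.07
    ' Black'           0.07
    ' chunks'          0.07
    (not shown)        0.07
    ' enorm'           0.06
    ' ())'             0.06
    'black'            0.06
    '.".'              0.06
    ' premature'       0.06
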
    Activation Density: 0.010%

    No Known Activations