INDEX
    Explanations

    the word "raison" or words containing it

    references to the concept of "race" or "racial issues."

    Negative Logits
    renheit    -0.90
    izabeth    -0.84
    rate       -0.79
    rates      -0.78
    sburgh     -0.77
    ledged     -0.76
    grad       -0.74
     Darius    -0.71
    sburg      -0.70
    ij士       -0.70
    Positive Logits
    posium            0.80
     sidx             0.80
    SpaceEngineers    0.74
    istically         0.72
    agate             0.70
    isin              0.70
    selves            0.69
     srfAttach        0.68
    ific              0.67
    sem               0.67
    Activation Density: 0.057%

    No Known Activations