    Explanations

    words related to exclusion or being excluded

    instances of the word "exclude" and its variations, indicating a focus on exclusionary practices or policies

    Negative Logits
    Token        Logit
    "ingham"     -0.84
    "oğ"         -0.78
    "Found"      -0.77
    " stead"     -0.75
    "nan"        -0.69
    "ebus"       -0.67
    "vous"       -0.67
    " deed"      -0.66
    "des"        -0.66
    "̶"          -0.66
    Positive Logits
    Token            Logit
    " excluding"     0.81
    " spoilers"      0.78
    " bystanders"    0.76
    "ively"          0.76
    " exclude"       0.75
    " excluded"      0.75
    " excludes"      0.73
    " minded"        0.73
    " prejudice"     0.67
    " spo"           0.64
    Activation Density: 0.011%

    No Known Activations