INDEX
    Explanations

    references to social and political groups, particularly focusing on issues affecting various communities

    Negative Logits
    Token       Logit
    lessly      -0.22
    ings        -0.19
    adoo        -0.17
    usc         -0.15
    TURE        -0.15
    ful         -0.14
    lessness    -0.14
    oner        -0.13
    less        -0.13
    ively       -0.13
    Positive Logits
    Token       Logit
    -American   0.23
    -Americans  0.20
    /Linux      0.19
    -Benz       0.19
    /OR         0.19
    .gov        0.18
    /AIDS       0.18
    /mac        0.17
    berger      0.16
    /MIT        0.16
    Activation Density: 0.289%

    No Known Activations