INDEX
    Explanations

    references to social justice and advocacy

    Negative Logits
    Token        Logit
    "amins"      -0.15
    " sesso"     -0.14
    "iego"       -0.14
    "onaut"      -0.14
    "롱"         -0.13
    "_CSR"       -0.13
    "ashire"     -0.13
    "-Americ"    -0.13
    "chner"      -0.13
    "NewItem"    -0.13
    Positive Logits
    Token        Logit
    " these"     0.30
    " them"      0.29
    "这些"       0.25
    "these"      0.24
    "These"      0.22
    " THESE"     0.22
    " These"     0.20
    " этих"      0.20
    " Them"      0.20
    " těchto"    0.19
    Activation Density: 0.355%

    No Known Activations