INDEX
    Explanations

    themes related to injustice and discrimination

    Negative Logits
     -                    -0.17
     Bale                 -0.16
     &                    -0.16
     seperate             -0.16
     [&                   -0.15
    egin                  -0.15
                          -0.15
     signalling           -0.15
     bod                  -0.15
     seper                -0.14
    Positive Logits
    ez                    0.17
    embre                 0.17
    esser                 0.15
    ç°                    0.15
     Usa                  0.15
     pushViewController   0.15
    azı                   0.15
     mấy                  0.14
    eman                  0.14
     negoci               0.14
    Act Density 0.001%

    No Known Activations