INDEX
    Explanations

    concepts related to racial justice and allyship

    Negative Logits
    _BATCH      -0.16
    ellas       -0.15
    apos        -0.14
    arness      -0.14
    ajas        -0.14
     życ        -0.14
    ãĥ¥ãĥ¼      -0.14
    Ñĩний       -0.13
    abad        -0.13
    ÑĬ          -0.13
    Positive Logits
    essler      0.15
    oke         0.15
    Finder      0.15
    eshire      0.14
    ë¹Ļ         0.14
    -mean       0.14
    adil        0.13
     balance    0.13
     Heard      0.13
     Derrick    0.13
    Act Density 0.162%

    No Known Activations