    Explanations

    themes related to inequality and marginalization

    Negative Logits
    Token         Logit
    "chia"        -0.16
    "iju"         -0.16
    "illow"       -0.15
    "afil"        -0.15
    "ãĥ³ãĤ¯"      -0.15
    "ijkstra"     -0.14
    "rego"        -0.14
    "acker"       -0.14
    "avou"        -0.14
    "anka"        -0.14
    Positive Logits
    Token         Logit
    " discrim"    0.34
    " marginal"   0.33
    " mist"       0.32
    " treated"    0.31
    " excluded"   0.30
    " marg"       0.28
    " malt"       0.28
    " ignored"    0.28
    "-treated"    0.26
    " left"       0.25
    Activation Density: 0.219%

    No Known Activations