INDEX
    Explanations

    references to caste, ethnicity, and socioeconomic status

    Negative Logits
    Token       Logit
    "arto"      -0.17
    " Pron"     -0.16
    "ibo"       -0.15
    "zen"       -0.14
    "eri"       -0.14
    "ipo"       -0.14
    "undles"    -0.14
    " Dy"       -0.14
    " env"      -0.13
    "lik"       -0.13
    Positive Logits
    Token             Logit
    "ë³Ħ"             0.34
    "-specific"       0.30
    "åĪ¥"             0.26
    "pecific"         0.26
    "_specific"       0.26
    " specific"       0.24
    "specific"        0.24
    "Specific"        0.23
    " specificity"    0.21
    " Specific"       0.21
    Activation Density: 0.195%

    No Known Activations