INDEX
    Explanations

    personal identity and classification statements

    repeated patterns or affirmations of identity, particularly those related to being a woman and to racial identity

    Negative Logits
    Token           Logit
    " snacks"       -0.77
    " fortun"       -0.71
    " publicity"    -0.70
    " seiz"         -0.69
    "çīĪ"           -0.67
    " telesc"       -0.63
    " ACS"          -0.63
    " Circus"       -0.63
    " srfAttach"    -0.63
    "ãĥ¼ãĥĨ"        -0.63

    Positive Logits
    Token           Logit
    "Ļ"             1.24
    "¤"             1.21
    "ª"             1.15
    "¬"             1.13
    "£"             1.08
    "Ĵ"             1.07
    "ħ"             1.06
    "ĸ"             1.05
    "Ķ"             1.05
    "¼"             1.02
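
Several tokens in the logit lists (for example "çīĪ", "ãĥ¼ãĥĨ", and the single-character positive-logit tokens) look like byte-level BPE token strings rather than readable text. The sketch below assumes, without confirmation from this page, that the model uses the GPT-2 byte-to-unicode table, and that the dashboard shows the space marker as a plain space. The helper names `byte_decoder`, `token_to_bytes`, and `token_to_text` are hypothetical and exist only for illustration.

```python
def byte_decoder():
    """Build the inverse of the GPT-2 byte-to-unicode table (char -> raw byte)."""
    # Bytes that GPT-2 maps to themselves (printable Latin-1 ranges).
    keep = (list(range(ord("!"), ord("~") + 1))
            + list(range(ord("¡"), ord("¬") + 1))
            + list(range(ord("®"), ord("ÿ") + 1)))
    table = {chr(b): b for b in keep}
    # Every other byte is shifted to code point 256 + n, in ascending byte order.
    n = 0
    for b in range(256):
        if b not in keep:
            table[chr(256 + n)] = b
            n += 1
    # Assumption: the dashboard renders the space marker as a literal space.
    table[" "] = 0x20
    return table


DECODER = byte_decoder()


def token_to_bytes(token):
    """Map a displayed token string back to its raw bytes."""
    return bytes(DECODER[ch] for ch in token)


def token_to_text(token):
    """Decode the raw bytes as UTF-8; incomplete sequences become U+FFFD."""
    return token_to_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["çīĪ", "ãĥ¼ãĥĨ", "Ļ", "¤", " snacks"]:
        print(repr(tok), token_to_bytes(tok).hex(), repr(token_to_text(tok)))
```

Under these assumptions, "çīĪ" corresponds to the bytes e7 89 88 (the character 版) and "ãĥ¼ãĥĨ" to the two katakana characters ーテ. The positive-logit tokens each map to a single byte in the 0x80 to 0xBF range, which is a UTF-8 continuation byte and does not form a complete character on its own.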
    Activation Density: 0.200%

    No Known Activations