INDEX
    Explanations

    critical or significant terms related to social and cultural identity

    Negative Logits
     ant
    -0.17
    .ant
    -0.17
    tone
    -0.17
     Pact
    -0.16
    ergic
    -0.15
     Anton
    -0.15
    antine
    -0.15
    jev
    -0.15
    anton
    -0.15
    urr
    -0.15
    Positive Logits
    ław
    0.17
    ierge
    0.17
    езда
    0.16
    ohn
    0.16
    ienie
    0.16
    inki
    0.15
    radient
    0.15
    lod
    0.15
    inker
    0.14
    è¹
    0.14
    Act Density 0.019%

    No Known Activations