INDEX
    Explanations

    references to groups of people, including demographics and roles within society

    Negative Logits
    Token             Logit
    "inci"            -0.16
    "coni"            -0.16
    "obao"            -0.15
    "orge"            -0.15
    "rung"            -0.14
    "angan"           -0.13
    "roti"            -0.13
    "urd"             -0.13
    " prepend"        -0.13
    "illis"           -0.13

    Positive Logits
    Token             Logit
    " alike"           1.53
    " equally"         0.71
    " respectively"    0.57
    "一样"             0.39
    " respective"      0.38
    " equal"           0.37
    " gleich"          0.36
    " similarly"       0.35
    " igual"           0.34
    " Equal"           0.30
    Act Density 0.160%

    No Known Activations