    Explanations

    concepts related to equality and fairness in societal contexts

    Negative Logits
    oste        -0.07
    riad        -0.07
    olo         -0.07
     SPDX       -0.06
    olon        -0.06
    altar       -0.06
    ante        -0.06
    钮           -0.06
     ◄          -0.06
    ülü         -0.06
    Positive Logits
     equal      0.23
     equally    0.20
    equal       0.19
     Equal      0.19
     igual      0.19
    Equal       0.18
     EQUAL      0.18
     eşit       0.16
     equality   0.16
    _equal      0.15
    Activation Density 0.196%

    No Known Activations