INDEX
    Explanations
    Negative Logits
    Token             Logit
    " ем"             -0.08
    " Horizons"       -0.08
    "utrients"        -0.08
    "Ltd"             -0.08
    " Eric"           -0.08
    " эм"             -0.08
    "muş"             -0.08
    " لک"             -0.08
    "álu"             -0.08
    "aktiv"           -0.07
    Positive Logits
    Token             Logit
    " disparity"      0.11
    " disparities"    0.10
    " unequal"        0.10
    " verschil"       0.09
    " absoluta"       0.09
    " rozd"           0.09
    " difference"     0.09
    " hierarchical"   0.08
    " favors"         0.08
    " favorit"        0.08
    Activation Density: 0.099%

    No Known Activations