INDEX
    Explanations

    occurrences of the prefix "uns," indicating negation or absence

    Negative Logits
    zin     -0.09
    zan     -0.08
    ervo    -0.08
    etic    -0.08
    SYNC    -0.07
    baş     -0.07
    zd      -0.07
    hop     -0.07
    hs      -0.07
    aver    -0.07
    Positive Logits
     uns    0.08
     Uns    0.08
    d       0.08
    utom    0.07
    ar      0.07
    y       0.06
    ا       0.06
    paring  0.06
    air     0.06
    ward    0.06
    Act Density 0.006%

    No Known Activations