INDEX
    Negative Logits
    ס    1.48
    ين    1.48
    ك    1.48
    ן    1.30
    ну    1.20
    ية    1.19
    تي    1.16
    كين    1.13
    جي    1.11
    يب    1.10
    Positive Logits
     for    1.24
     $\    1.19
     as    1.12
     virtualization    1.09
    اری    1.07
    Α    1.07
    ated    1.06
    el    1.05
     water    1.03
     rewriting    1.02
    Act Density 0.007%

    No Known Activations