INDEX
    Negative Logits
    Token         Logit
    "ীদ"          2.50
    "itarian"     2.38
    "famil"       2.22
    " braced"     2.19
    " shí"        2.18
    " syringe"    2.16
    "𝘐"           2.14
    "Xaml"        2.09
    (token text missing)  2.07
    (token text missing)  2.06
    Positive Logits
    Token         Logit
    "an"          3.72
    "at"          3.08
    "م"           2.91
    "на"          2.61
    (token text missing)  2.56
    "ب"           2.50
    (token text missing)  2.44
    "ל"           2.42
    "σ"           2.38
    "ва"          2.37
    Activation Density: 0.021%

    No Known Activations