    Negative Logits
    و     2.06
    ت     1.84
    ل     1.76
    ہ     1.72
    ن     1.60
    ي     1.57
    י     1.56
    i     1.53
    ق     1.43
    ul    1.32
    Positive Logits
    (token not rendered)    0.93
    کھیلو    0.89
    lưu    0.87
    𝙖    0.87
    𝙡    0.85
    ON    0.84
    𝙤    0.82
    ρι    0.81
    (token not rendered)    0.81
    กับ    0.80
    Act Density 0.002%

    No Known Activations