    Negative Logits
    "ش"    1.58
    " an"    1.31
    "c"    1.15
    (no token shown)    1.14
    "d"    1.13
    "p"    1.13
    (no token shown)    1.12
    (no token shown)    1.12
    "هم"    1.09
    "ch"    1.05
    POSITIVE LOGITS
    "ות"    1.42
    (no token shown)    1.29
    "at"    1.22
    "ла"    1.22
    (no token shown)    1.22
    "ட்ட"    1.12
    (no token shown)    1.09
    "u"    1.05
    "לים"    1.01
    "רץ"    1.00
    Activation Density 0.087%

    No Known Activations