INDEX
    Explanations
    Negative Logits
    Token               Logit
    ك                   1.45
    (no visible token)  1.32
     as                 1.30
    ف                   1.30
    ק                   1.24
    ;                   1.23
    ص                   1.23
    ل                   1.20
    ا                   1.18
    (no visible token)  1.18
    Positive Logits
    Token               Logit
    (no visible token)  1.23
    4                   1.08
    atir                1.03
    5                   1.02
    ۹                   1.02
    им                  1.02
    ukul                0.96
    एस                  0.94
    ин                  0.93
    this                0.93
    Act Density 0.011%

    No Known Activations