    Explanations

    providing information or support

    Negative Logits
    ند    1.05
     但是    1.05
    ³.    1.02
    。",    1.00
     as    0.99
    (blank)    0.99
     I    0.97
     ٣    0.96
     الص    0.94
    لي    0.93
    Positive Logits
    K    1.82
    T    1.80
    V    1.73
    G    1.66
    ב    1.63
    an    1.60
    M    1.59
    R    1.57
    N    1.52
    D    1.51
    Act Density 0.595%

    No Known Activations