INDEX
    Negative Logits
    "ED"  1.25
    (blank)  1.13
    "↵↵"  1.09
    " of"  1.04
    " as"  1.04
    "ES"  1.04
    "IA"  0.99
    (blank)  0.97
    "EN"  0.96
    "де"  0.91
    Positive Logits
    "نه"  1.02
    "<0x0E>"  1.01
    "')"  1.01
    "ii"  0.93
    " It"  0.91
    (blank)  0.91
    "ڦ"  0.91
    "َاب"  0.90
    "ä"  0.90
    (blank)  0.90
    Act Density 0.000%

    No Known Activations