    NEGATIVE LOGITS
    Token        Logit
    "t"          1.54
    "ian"        1.13
    "ine"        1.08
    "al"         1.04
    "ure"        1.01
    "lington"    1.01
    " to"        0.95
    "niveau"     0.94
    "el"         0.94
    "ous"        0.94
    POSITIVE LOGITS
    Token        Logit
    "ف"          1.67
    (blank)      1.26
    "ي"          1.20
    " a"         1.14
    (blank)      1.14
    (blank)      1.11
    "ру"         1.06
    "ق"          1.04
    "ко"         1.03
    "اء"         1.01
    Activation density: 0.000%
    No known activations.