INDEX
    Explanations
    NEGATIVE LOGITS
    ug
    1.10
     исполни
    1.06
    viä
    1.05
    sberg
    1.03
    subsubsection
    0.99
    queda
    0.99
     одновременно
    0.98
    gon
    0.97
    0.96
     equivalently
    0.95
    POSITIVE LOGITS
    ين
    1.54
    د
    1.37
    الأ
    1.33
    ка
    1.25
    1.22
    ك
    1.20
    ن
    1.20
    الإ
    1.17
    الش
    1.16
    مع
    1.15
    Activation Density: 0.001%

    No Known Activations