INDEX
    NEGATIVE LOGITS
    (token not shown)  0.62
    (token not shown)  0.58
    ۹  0.58
    ون  0.55
    ی  0.55
    ري  0.52
    (token not shown)  0.52
    ۳  0.49
    ۴  0.47
    لي  0.46
    POSITIVE LOGITS
    m  0.57
    '  0.56
     nivel  0.50
    b  0.47
    is  0.46
     level  0.46
    ol  0.44
     niveau  0.44
    ad  0.43
     poziom  0.43
    Act Density 0.076%

    No Known Activations