    Explanations
    No Explanations Found
    Negative Logits
    d           1.62
    v           1.50
    z           1.45
    et          1.36
    V           1.27
    "           1.26
    B           1.24
     on         1.19
    л           1.18
    m           1.16
    Positive Logits
    ین          1.16
    ول          1.06
    یل          0.97
    ाने         0.96
                0.96
    ف           0.95
    الی         0.95
    ફેદ         0.93
    но          0.92
     μπορεί     0.91
    Act Density 0.000%

    No Known Activations