INDEX
    Explanations

    when introducing a condition or time

    Negative Logits
    ال    1.17
    ent    1.16
    ت    1.05
    د    1.02
    jenigen    1.01
    كل    1.00
    ش    0.98
    ري    0.95
    ار    0.92
    ar    0.91
    Positive Logits
     by    1.27
     ی    1.22
    بی    1.21
    К    1.13
    می    1.10
    دی    1.09
    ンの    1.08
    1.07
    1.05
     そして    1.04
    Act Density 0.028%

    No Known Activations