INDEX
    Explanations

    common sentence structure

    Negative Logits
    Token        Value
    " of"        1.54
    " at"        1.39
    "dır"        1.34
    "ש"          1.28
    (blank)      1.20
    " وزیر"      1.06
    "ח"          1.05
    "ِی"         1.05
    " from"      1.04
    "の影響"      1.04
    Positive Logits
    Token        Value
    "a"          1.09
    "nél"        1.05
    (blank)      1.05
    "uje"        1.04
    "ون"         1.03
    "بود"        1.02
    "ého"        0.99
    "ä"          0.96
    "ých"        0.96
    "an"         0.95
    Activation Density: 0.068%

    No Known Activations