    Negative Logits
    Token     Logit
    "на"      1.53
    "رین"     1.40
    "ST"      1.39
    "ר"       1.38
    "ر"       1.38
    "ל"       1.30
    "س"       1.26
    "ی"       1.25
    "ים"      1.19
    "ה"       1.19
    Positive Logits
    Token                      Logit
    "is"                       1.43
    "ával"                     1.22
    (token missing in source)  1.22
    "ier"                      1.21
    "ư"                        1.21
    "ud"                       1.20
    "ä"                        1.19
    "ú"                        1.13
    "áln"                      1.11
    " necessário"              1.10
    Activation Density: 0.020%

    No Known Activations