    Negative Logits
    Token            Logit
    "يك"             1.19
    "يل"             1.12
    "يب"             1.09
    "на"             1.05
    "يمة"            0.98
    "ков"            0.97
    " souci"         0.91
    "يكية"           0.88
    "માં"            0.87
    "كت"             0.86
    Positive Logits
    Token            Logit
    "i"              1.40
    ":"              1.40
    "י"              1.27
    " for"           1.21
                     1.18
    "</h3>"          1.14
    "el"             1.12
    "ה"              1.12
    ";"              1.11
    " accustomed"    1.09
    Activation Density: 0.002%

    No Known Activations