    Negative Logits
    greets       1.06
    zusätzlich   1.05
    jeweils      1.03
    લાખ          1.03
    llegar       1.02
    corresponds  1.00
    ढंग          0.96
    ็ว           0.96
    혹은         0.96
    respectiv    0.96
    Positive Logits
    د            1.48
    (token not rendered)  1.32
    ного         1.29
    IA           1.27
    ص            1.24
    ח            1.23
    ح            1.23
    الأ          1.20
    υτό          1.19
    (token not rendered)  1.16
    Act Density 0.097%

    No Known Activations