    Negative Logits
     تعیین
    -0.07
     žádný
    -0.07
    าบาล
    -0.07
    enerima
    -0.07
    èmes
    -0.06
     yarar
    -0.06
    icers
    -0.06
    agar
    -0.06
     جديدة
    -0.06
     joven
    -0.06
    Positive Logits
     ios
    0.07
     tracing
    0.06
    )get
    0.06
    fails
    0.06
     susp
    0.06
     wrong
    0.06
    0.06
     cds
    0.06
    /or
    0.06
     depos
    0.06
    Activation Density: 0.003%

    No Known Activations