INDEX
    Explanations
    Negative Logits
    " തുടങ്ങ"  -0.08
    " לך"  -0.08
    " تبدأ"  -0.08
    "anggal"  -0.08
    " alarma"  -0.08
    " kapsam"  -0.08
    " coverage"  -0.08
    " warned"  -0.07
    " ukusebenza"  -0.07
    " commentator"  -0.07
    Positive Logits
    "习近平"  0.09
    " kre"  0.08
    " simb"  0.08
    " е"  0.08
    "ula"  0.07
    " Einen"  0.07
    " symbolic"  0.07
    " colt"  0.07
    " vẻ"  0.07
    "uman"  0.07
    Activation Density: 0.027%

    No Known Activations