INDEX
    Negative Logits
    "vang"     -0.08
    "sticky"   -0.08
    " Zig"     -0.07
    " sticky"  -0.07
    "Jimmy"    -0.07
    "ergen"    -0.07
    (blank)    -0.07
    "ทำ"       -0.07
    "xd"       -0.07
    "orgt"     -0.07
    Positive Logits
    " байд"    0.09
    " mese"    0.09
    " ആറ"      0.08
    " মই"      0.08
    " Worked"  0.08
    " aturan"  0.08
    " قانون"   0.08
    " сиёс"    0.07
    "íocht"    0.07
    " ოც"      0.07
    Activation Density: 0.001%

    No Known Activations