INDEX
    Explanations

    Mathematical expressions

    Negative Logits
    " mf"         -0.08
    " tids"       -0.08
    " satisfe"    -0.08
    " mwen"       -0.08
    ",m"          -0.08
    " mell"       -0.07
    "يغ"          -0.07
    " mc"         -0.07
    " mg"         -0.07
    "achine"      -0.07
    Positive Logits
    " oppure"      0.08
    " Novak"       0.08
    ".au"          0.08
    " همچنین"      0.08
    " Ż"           0.08
    " 이렇게"      0.07
    " ઉપરાંત"      0.07
    ".dw"          0.07
    " 또한"        0.07
    " chamado"     0.07
    Activation Density: 0.064%

    No Known Activations