INDEX
    NEGATIVE LOGITS
    Token            Logit
    " مذه"           -0.07
    "=line"          -0.07
    " Lear"          -0.06
    " acceler"       -0.06
    " İtalya"        -0.06
    "iming"          -0.06
    "lení"           -0.06
    "ificates"       -0.06
    "esan"           -0.06
    " fool"          -0.06

    POSITIVE LOGITS
    Token            Logit
    ".Select"        0.07
    "-many"          0.07
    " punto"         0.06
    " Pune"          0.06
    "advance"        0.06
    " downstream"    0.06
    " vil"           0.06
    " renowned"      0.06
    " interv"        0.06
    " duly"          0.06

    Activation Density: 0.037%

    No Known Activations