INDEX
    Explanations
    Negative Logits
    Token               Logit
    Tmp                 -0.07
    trash               -0.06
     cd                 -0.06
    ardır               -0.06
    _tim                -0.06
    /fa                 -0.06
     chk                -0.06
    _ING                -0.06
    ろう                -0.06
    ]="                 -0.06
    Positive Logits
    Token               Logit
     başarı             0.07
     knowledgeable      0.07
    ucle                0.07
     ******************************************************************************↵  0.07
     undertake          0.06
    /************************************************  0.06
     done               0.06
     authors            0.06
     sampled            0.06
     correspondence     0.06
    Activation Density: 0.001%

    No Known Activations