INDEX
    Explanations
    Negative Logits
    iya           -0.07
    abcdefghijkl  -0.06
    -direct       -0.06
    foundation    -0.06
    odel          -0.06
    CMS           -0.06
     workaround   -0.06
    Bron          -0.06
    аки           -0.06
    .ke           -0.06
    Positive Logits
     bir          0.06
     itemName     0.06
    itamin        0.06
    ért           0.06
     Linkedin     0.06
    ADDING        0.06
     pushed       0.06
    лены          0.06
     وات          0.06
     congestion   0.06
    Act Density 0.011%

    No Known Activations