INDEX
    Explanations
    NEGATIVE LOGITS
    "(Page"         -0.07
    "\tlua"         -0.07
    " Sanayi"       -0.07
    "HD"            -0.07
    " amo"          -0.07
    "(od"           -0.07
    "ffective"      -0.07
    " Rück"         -0.07
    " Peters"       -0.06
    " hd"           -0.06
    POSITIVE LOGITS
    " thanks"       0.10
    " gracias"      0.10
    " благодаря"    0.09
    " grâce"        0.07
    "дяки"          0.07
    " courtesy"     0.07
    "Thanks"        0.07
    " sayesinde"    0.07
    "ocalypse"      0.07
    " thank"        0.07
    Activation Density: 0.015%

    No Known Activations