INDEX
    Explanations
    Negative Logits
    Token       Logit
    " hidrat"   -0.09
    " petals"   -0.09
    " BC"       -0.08
    " Doggy"    -0.08
    "çek"       -0.08
    " pedal"    -0.08
    " Eup"      -0.08
    "isses"     -0.08
    " dục"      -0.08
    "ότε"       -0.08
    Positive Logits
    Token       Logit
    "athons"    0.08
    "-able"     0.08
    " chance"   0.07
    "azard"     0.07
    " looph"    0.07
    " saba"     0.07
    "need"      0.07
    " entry"    0.07
    " Mas"      0.07
    " lim"      0.07
    Act Density 0.006%

    No Known Activations