    NEGATIVE LOGITS
    tim            -0.07
    Rh             -0.07
    חי             -0.07
     başka         -0.07
     Another       -0.07
    TIM            -0.07
     πρό           -0.07
     Breit         -0.07
     Rhythm        -0.07
     Reynolds      -0.07
    POSITIVE LOGITS
     overnight     0.09
     hamburger     0.09
     مدت           0.08
    至少           0.08
     minstens      0.08
    <Device        0.08
     vähemalt      0.08
     dauern        0.08
                   0.07
     activate      0.07
    Act Density 0.010%

    No Known Activations