    Negative Logits
    Token            Logit
    " kitap"         -0.08
    " вода"          -0.06
    "Showing"        -0.06
    "简单"           -0.06
    "альных"         -0.06
    " shielding"     -0.06
    "Wednesday"      -0.06
    " Chính"         -0.06
    " speculated"    -0.06
    "losures"        -0.06
    Positive Logits
    Token            Logit
    "bye"            0.11
    " Bye"           0.09
    " goodbye"       0.08
    " bye"           0.08
    "Mi"             0.07
    " Aussie"        0.07
    "LE"             0.06
    " MLB"           0.06
    " Binder"        0.06
    " neb"           0.06
    Activation Density: 0.005%

    No Known Activations