    Negative Logits
    Token            Logit
    "anish"          -0.08
    "dar"            -0.07
    " раньше"        -0.07
    "kill"           -0.07
    " муж"           -0.07
    " adhering"      -0.07
    " приходится"    -0.07
    " wet"           -0.07
    " Mercury"       -0.07
    " Aggreg"        -0.07
    Positive Logits
    Token            Logit
    "推薦"           0.09
    "סום"            0.09
    " Homemade"      0.09
    "usetzen"        0.08
    "otherapy"       0.08
    " homemade"      0.08
    " ostvar"        0.08
    " vpn"           0.08
    "recommended"    0.08
    " Ampl"          0.08
    Activation Density: 0.002%

    No Known Activations