INDEX
    Negative Logits
    (serv
    -0.07
    采购
    -0.07
     وجود
    -0.07
     indiscrim
    -0.07
     WWE
    -0.07
    ));//
    -0.06
     polling
    -0.06
    .receiver
    -0.06
    refs
    -0.06
    Mailer
    -0.06
    Positive Logits
    0.06
     elde
    0.06
     boyut
    0.06
     Or
    0.06
    .pipe
    0.06
     Больш
    0.06
     громад
    0.06
     Leopard
    0.06
    _mk
    0.05
    行为
    0.05
    Act Density 0.027%

    No Known Activations