INDEX
    Negative Logits
    Token            Logit
    " ensuring"      -0.08
    "注明"           -0.08
    " bb"            -0.07
    " student"       -0.07
    " iba't"         -0.07
    " Soc"           -0.07
    "icol"           -0.07
    " দাব"           -0.07
    "081"            -0.07
    "Lin"            -0.07
    Positive Logits
    Token            Logit
    " свою"          0.09
    " ukuthi"        0.08
    " rằng"          0.08
    " Mest"          0.08
    " coincidence"   0.08
    " πως"           0.08
    ".once"          0.08
    " بىر"           0.07
    " crowdfunding"  0.07
    " Jay"           0.07
    Activation Density: 0.019%

    No Known Activations