    Negative Logits
    Token          Logit
    "(Date"        -0.07
    " батьків"     -0.07
    "cılık"        -0.07
    [missing]      -0.06
    " Bac"         -0.06
    [missing]      -0.06
    " verifica"    -0.06
    "(repo"        -0.06
    " derby"       -0.06
    " stu"         -0.06

    Positive Logits
    Token          Logit
    "-rich"        0.07
    " eternity"    0.06
    "._"           0.06
    ".has"         0.06
    "954"          0.06
    " Apple"       0.06
    "مر"           0.06
    ".Promise"     0.06
    "alytics"      0.06
    " pains"       0.06
    Activation Density: 0.000%

    No Known Activations