    Negative Logits
    Token               Logit
    " bootstrap"        -0.09
    " каждому"          -0.08
    "bootstrap"         -0.08
    "Bootstrap"         -0.07
    " jeden"            -0.07
    " doma"             -0.07
    " correlate"        -0.07
    " correlations"     -0.07
    " neph"             -0.07
    " adhesives"        -0.07
    Positive Logits
    Token               Logit
    "702"               0.09
    " پاران"            0.08
    " або"              0.08
    " Ending"           0.08
    " څخه"              0.08
    "Ending"            0.08
    "Until"             0.08
    "ending"            0.08
    "_pow"              0.08
    "pow"               0.07
    Activation Density: 0.006%

    No Known Activations