INDEX
    Explanations
    Negative Logits
    ंख        -0.07
     причины  -0.07
    |\        -0.06
    _sup      -0.06
     jaar     -0.06
    _dropout  -0.06
    assing    -0.06
    Squared   -0.06
    _EX       -0.06
     que      -0.06
    Positive Logits
     pylab    0.07
     Coron    0.06
    consum    0.06
    шается    0.06
    Original  0.06
     macros   0.06
    Sal       0.06
    /product  0.06
    .goto     0.06
              0.06
    Act Density 0.001%

    No Known Activations