INDEX
    Explanations
    NEGATIVE LOGITS
    Token          Logit
    "=z"           -0.07
    "стати"        -0.07
    "(pin"         -0.07
    " studied"     -0.07
    "noho"         -0.06
    "(level"       -0.06
    "pkt"          -0.06
    " gerektir"    -0.06
    "Accounts"     -0.06
    "(pic"         -0.06
    POSITIVE LOGITS
    Token          Logit
    " Inputs"      0.07
    " Crushing"    0.07
    " makes"       0.06
    " spline"      0.06
    " Rick"        0.06
    " ліка"        0.06
    " BH"          0.06
    " qualities"   0.06
    " unknow"      0.06
    "а"            0.06
    Activation Density: 0.019%

    No Known Activations