    Negative Logits
     shampoo              -0.07
     Monetary             -0.06
    ,**                   -0.06
    \",                   -0.06
    -best                 -0.06
     rub                  -0.06
    ВА                    -0.06
     compat               -0.05
     như                  -0.05
    §                     -0.05
    Positive Logits
    Direction             0.07
    -directed             0.07
                          0.06
    xFFFFFFFF             0.06
    оці                   0.06
     DESC                 0.06
                          0.06
    ğitim                 0.06
     CrossAxisAlignment   0.06
    sın                   0.06
    Activation Density: 0.021%

    No Known Activations