    Negative Logits
    //--------------------------------  -0.07
    Originally                          -0.07
    _LP                                 -0.07
    ;line                               -0.07
    (())↵                               -0.07
    _SHIFT                              -0.07
    idity                               -0.06
    lie                                 -0.06
    physics                             -0.06
    Block                               -0.06
    Positive Logits
     увид          0.06
     Proper        0.06
    Grace          0.06
     پیدا          0.06
     procedures    0.06
     rotates       0.06
     unauthorized  0.06
     rejection     0.06
     Ops           0.06
    ủa             0.06
    Activation Density: 0.015%

    No Known Activations