INDEX
    Explanations
    Negative Logits
    Temporary       -0.07
     ува            -0.07
     удар           -0.06
     lyn            -0.06
     integration    -0.06
    Cookies         -0.06
     حسب            -0.06
    severity        -0.06
    _",             -0.06
    -ad             -0.06
    Positive Logits
    {{$             0.07
     перел          0.06
    COL             0.06
     Vect           0.06
     frantic        0.06
    uilt            0.06
    общ             0.06
    Muon            0.06
    828             0.06
     Libya          0.06
    Activation Density: 0.000%

    No Known Activations