INDEX
    Negative Logits
     такі           -0.07
    lifetime        -0.07
     другого        -0.07
    anding          -0.07
    judge           -0.06
     mav            -0.06
                    -0.06
    827             -0.06
    違い            -0.06
    istem           -0.06
    Positive Logits
    aph             0.06
     discharged     0.06
     orientation    0.06
     Alibaba        0.06
     collage        0.06
    achs            0.06
     advise         0.06
     "+↵            0.06
     Stable         0.06
     neglig         0.05
    Activation Density: 0.001%

    No Known Activations