    NEGATIVE LOGITS
    Token         Logit
    "_snd"        -0.06
                  -0.06
    "Nonnull"     -0.06
    "western"     -0.06
                  -0.06
    "Distinct"    -0.06
    " onNext"     -0.06
    "<x"          -0.06
    ".goBack"     -0.06
    "516"         -0.06
    POSITIVE LOGITS
    Token         Logit
    " зробити"    0.07
    "正常"        0.07
    " effort"     0.07
    "-contact"    0.07
    " dobré"      0.06
    " takım"      0.06
    " dikkat"     0.06
    " magna"      0.06
    " çünkü"      0.06
    " wilt"       0.06
    Activation Density: 0.011%

    No Known Activations