INDEX
    Explanations

    gradients effectively from

    Negative Logits
    " dựng"      0.74
    " sikap"     0.71
                 0.70
    "subplots"   0.70
    " înviat"    0.67
    "合わせ"      0.67
    " Stand"     0.66
    "Stand"      0.64
    " تکن"       0.63
    "ؔ"           0.62
    Positive Logits
    " flow"      3.29
    " flowing"   3.23
    " flows"     3.08
    " flowed"    2.96
    "flow"       2.84
    " Flow"      2.78
    "flowing"    2.69
    "Flow"       2.64
    " stream"    2.55
                 2.48
    Activation Density 0.974%

    No Known Activations