    Negative Logits
     beaucoup         -0.06
     surrogate        -0.06
    lerle             -0.06
                      -0.06
    _wave             -0.06
     MLA              -0.06
     crater           -0.06
    Ngoài             -0.06
     firstName        -0.06
     Root             -0.06
    Positive Logits
     Greenwich        0.08
                      0.07
    ח                 0.07
    .AutoScaleMode    0.07
    orney             0.07
     polling          0.07
    ignet             0.06
    )|(               0.06
    gt                0.06
    hazi              0.06
    Activation Density: 0.001%

    No Known Activations