    NEGATIVE LOGITS
    Token                   Logit
    " Bip"                  -0.07
    "_both"                 -0.07
    " Selected"             -0.07
    (token not rendered)    -0.07
    "مرض"                   -0.06
    " tomatoes"             -0.06
    "MISS"                  -0.06
    "iblings"               -0.06
    "]%"                    -0.06
    " classify"             -0.06
    POSITIVE LOGITS
    Token                   Logit
    "ournée"                0.07
    " forb"                 0.07
    " progression"          0.07
    ".Warn"                 0.07
    " construction"         0.06
    " solely"               0.06
    (token not rendered)    0.06
    (token not rendered)    0.06
    " Dickinson"            0.06
    (token not rendered)    0.06
    Activation Density 0.365%

    No Known Activations