    Negative Logits
    Token              Logit
    " holdings"        -0.09
    " sb"              -0.08
    (not shown)        -0.08
    "وط"               -0.08
    "holds"            -0.07
    " what's"          -0.07
    "(material"        -0.07
    (not shown)        -0.07
    "(sb"              -0.07
    "sb"               -0.07

    Positive Logits
    Token              Logit
    " perror"          0.13
    "Warning"          0.13
    ".Errorf"          0.13
    "_warning"         0.13
    ".warning"         0.12
    " warning"         0.12
    " disappointed"    0.12
    "warning"          0.12
    " Warning"         0.12
    " erreur"          0.11

    Activation Density: 0.015%

    No Known Activations