INDEX
    NEGATIVE LOGITS
    " considered"      -1.98
    " Considered"      -1.78
    "considered"       -1.75
    " considerados"    -1.42
    " considérée"      -1.42
    " considéré"       -1.41
    " considerado"     -1.41
    " consideradas"    -1.38
    " regarded"        -1.31
    " considerada"     -1.31
    POSITIVE LOGITS
    " to"         0.75
    " by"         0.75
    " its"        0.64
    " the"        0.62
    " a"          0.61
    " Wilk"       0.60
    " abusive"    0.57
    " it"         0.57
    " as"         0.57
    " lawful"     0.56
    Activation Density: 0.044%

    No Known Activations