INDEX
    Explanations
    Negative Logits
    " certificate"   -0.07
    " pione"         -0.07
    " directed"      -0.06
    " perks"         -0.06
    " gubern"        -0.06
    "ertificate"     -0.06
    " volleyball"    -0.06
    " freely"        -0.06
    " captain"       -0.06
    " bulletin"      -0.06
    Positive Logits
    " Inline"        0.09
    " Invalid"       0.09
    " invalid"       0.08
    "invalid"        0.08
    " Unsafe"        0.08
    " insecurity"    0.08
    ".invalid"       0.07
    " INA"           0.07
    "Inline"         0.07
    " inadequate"    0.07
    Activation Density: 0.086%

    No Known Activations