INDEX
    Explanations
    Negative Logits
    Token         Logit
    "_FL"         -0.07
    "_sc"         -0.07
    "enso"        -0.07
    " Engines"    -0.07
    "uably"       -0.07
    "alizace"     -0.07
    "ümüzde"      -0.07
    "irling"      -0.06
    "_SL"         -0.06
    "`,"          -0.06
    Positive Logits
    Token              Logit
    " mnist"           0.07
    "Char"             0.06
    " APPLE"           0.06
    ".opensource"      0.06
    (whitespace only)  0.06
    "Abort"            0.06
    "basic"            0.06
    " Shepherd"        0.06
    " proceed"         0.06
    ".Roles"           0.06
    Activation Density: 0.002%

    No Known Activations