INDEX
    Explanations
    Negative Logits
     Oscars          -0.06
     Properties      -0.06
     Confidence      -0.06
    urrets           -0.06
    lay              -0.06
                     -0.06
    Reuters          -0.06
     كبير            -0.06
    Ing              -0.06
     automobiles     -0.06
    Positive Logits
    .Boolean         0.07
    (Int             0.07
    (range           0.07
    [string          0.07
    @js              0.06
     reim            0.06
    (drop            0.06
     svaz            0.06
                     0.06
    一步             0.06
    Activation Density 0.011%

    No Known Activations