    Negative Logits
    Token          Logit
    (blank)        -0.08
    " Mend"        -0.07
    "689"          -0.07
    "plx"          -0.07
    " miles"       -0.06
    "latlong"      -0.06
    "714"          -0.06
    "414"          -0.06
    "bie"          -0.06
    "واج"          -0.06
    Positive Logits
    Token          Logit
    " nin"         0.08
    " iT"          0.07
    " alignSelf"   0.07
    ".ny"          0.07
    " vessel"      0.07
    " doubly"      0.07
    " ž"           0.07
    "-cultural"    0.06
    " Om"          0.06
    "otte"         0.06
    Activation Density: 0.228%

    No Known Activations