    Negative Logits
    " surf"        -0.08
    " AJ"          -0.08
    "AJ"           -0.08
    " bann"        -0.08
    "Clay"         -0.07
    " sleeves"     -0.07
    " Katr"        -0.07
    " denom"       -0.07
    " কো"          -0.07
    " controls"    -0.07
    Positive Logits
    " elastic"     0.08
                   0.08
                   0.07
    " kec"         0.07
    " Ital"        0.07
    " себ"         0.07
                   0.07
    " stre"        0.07
    " Using"       0.07
    " pat"         0.07
    Activation Density: 0.010%

    No Known Activations