INDEX
    Explanations
    NEGATIVE LOGITS
    " trimester"    -0.08
    " notation"     -0.08
    " wadanda"      -0.08
    " deform"       -0.08
    " setw"         -0.08
    " ubw"          -0.07
    " gemeins"      -0.07
    "орту"          -0.07
    " उपकरण"        -0.07
    " squeeze"      -0.07
    POSITIVE LOGITS
    " درباره"       0.09
    "about"         0.09
    "many"          0.09
    " كثير"         0.08
    "很多"          0.08
    "Esta"          0.08
    " describing"   0.08
    "/article"      0.08
    "افت"           0.08
    "Headline"      0.08
    Activation Density: 0.004%

    No Known Activations