    NEGATIVE LOGITS
    " bizim"  -0.08
    "这一"  -0.08
    " propensity"  -0.08
    " offrent"  -0.08
    " جی"  -0.08
    " Kitchen"  -0.08
    "Kitchen"  -0.07
    " کام"  -0.07
    " loin"  -0.07
    " 부족"  -0.07
    POSITIVE LOGITS
    " want"  0.11
    " চাই"  0.09
    " worry"  0.09
    " consider"  0.08
    " берем"  0.08
    " beachten"  0.08
    " feststellen"  0.08
    " revisar"  0.08
    " obtaining"  0.08
    " можем"  0.08
    Activation Density: 0.041%

    No Known Activations