INDEX
    Explanations
    Negative Logits
    " kapı"            -0.08
    "抽取"             -0.07
    " ambigu"          -0.07
    " europé"          -0.07
    " portray"         -0.07
    "(I"               -0.07
    " tamp"            -0.06
    " cerco"           -0.06
    "经贸"             -0.06
    " cryptography"    -0.06
    Positive Logits
    (blank)                 0.07
    " נה"                   0.07
    (blank)                 0.06
    (blank)                 0.06
    " stainless"            0.06
    " fold"                 0.06
    " Valid"                0.06
    (blank)                 0.06
    " initializer"          0.06
    ".LayoutControlItem"    0.06
    Act Density 0.002%

    No Known Activations