    NEGATIVE LOGITS
    Token             Logit
    " Apt"            -0.08
    " Doesn't"        -0.08
    "ോദ"              -0.08
    "_certificate"    -0.08
    " Tidak"          -0.07
    " Atmospheric"    -0.07
    "-Jährige"        -0.07
    "ichts"           -0.07
    " skipped"        -0.07
    " oneself"        -0.07
    POSITIVE LOGITS
    Token             Logit
    "部门"             0.09
    " underscores"    0.08
    " бизнес"         0.08
    "roc"             0.08
    " SMEs"           0.08
    "unders"          0.08
    " wherever"       0.08
    " wirtschaft"     0.07
    " underscore"     0.07
    "lav"             0.07
    Activation Density: 0.052%

    No Known Activations