    Explanations
    Negative Logits
    Token            Logit
    "elts"           -0.08
    "就觉得"         -0.07
    " Catalonia"     -0.07
    "anten"          -0.07
    "azzo"           -0.07
    "theon"          -0.07
    "oppable"        -0.07
    "ousedown"       -0.07
    "Ú"              -0.06
    "رأ"             -0.06
    Positive Logits
    Token            Logit
    "タル"           0.08
    " scanned"       0.07
    " the"           0.07
    " medical"       0.07
    "duplicate"      0.07
    "议案"           0.07
    " and"           0.07
    " difficulty"    0.07
    "되고"           0.07
    " deficiency"    0.07
    Activation Density: 0.023%

    No Known Activations