INDEX
    Explanations
    Negative Logits
    强调  -0.08
     classroom  -0.08
     emphasizing  -0.08
    'ac  -0.07
    ummet  -0.07
     contractions  -0.07
     skol  -0.07
    水果  -0.07
     Classroom  -0.07
    acem  -0.07
    Positive Logits
     содержание  0.09
     wiw  0.08
     Е  0.08
     coeff  0.08
     Lars  0.08
     Türkmenistan  0.08
    дын  0.08
     біз  0.08
     coef  0.08
     Mede  0.08
    Activation Density: 0.014%

    No Known Activations