    Negative Logits
    Token             Logit
    " convertible"    -0.08
    "例如"            -0.08
    "oped"            -0.07
    "kbd"             -0.07
    "listed"          -0.07
    "ensis"           -0.07
    " Listed"         -0.07
    "written"         -0.07
    " outspoken"      -0.07
    "lement"          -0.07
    Positive Logits
    Token               Logit
    " basics"           0.10
    " terminology"      0.10
    " introductions"    0.09
    " введ"             0.09
    " introduce"        0.09
    " introduct"        0.09
    " establish"        0.09
    "\timport"          0.08
    " prelim"           0.08
    " Basics"           0.08
    Activation Density: 0.023%

    No Known Activations