INDEX
    Explanations
    NEGATIVE LOGITS
    Token                 Logit
    "\tpr"                -0.07
    "keiten"              -0.06
    "ogi"                 -0.06
    " commented"          -0.06
    "만원입니다"          -0.06
    "<tag"                -0.06
    " вико"               -0.06
    " ш"                  -0.06
    " 하면"               -0.06
    "AttributedString"    -0.06

    POSITIVE LOGITS
    Token                 Logit
    "Filed"               0.07
    ".sub"                0.07
    " moms"               0.07
    "SHOT"                0.07
    " DIS"                0.06
    " Moms"               0.06
    " EH"                 0.06
    " exhaustive"         0.06
    " dying"              0.06
    " sadd"               0.06

    Activation Density: 1.166%

    No Known Activations