Negative Logits
    "()↵"    -0.07
    " occup"    -0.06
    " label"    -0.06
    " Clown"    -0.06
    " libs"    -0.06
    "ायन"    -0.05
    "XmlNode"    -0.05
    "admin"    -0.05
    "	           "    -0.05
    "iedy"    -0.05
Positive Logits
    " flea"    0.07
    " QUICK"    0.07
    "Research"    0.07
    "kB"    0.07
    "ocious"    0.07
    " собира"    0.06
    " бли"    0.06
    "わせ"    0.06
    "charg"    0.06
    "/report"    0.06
Activation Density: 0.004%

    No Known Activations