INDEX
    Explanations
    Negative Logits
    " libraries"      -0.08
    " CONTROL"        -0.07
    " viewing"        -0.07
    " advancing"      -0.07
    " resting"        -0.07
    "lever"           -0.06
    " incremental"    -0.06
    " COLLECTION"     -0.06
    " embark"         -0.06
    "	ti"           -0.06
    Positive Logits
    " contradict"     0.09
    " contrad"        0.07
    "되는"            0.07
    "BootTest"        0.07
    " deduct"         0.07
    " 그의"           0.06
    "рд"              0.06
    "كات"             0.06
    "icontains"       0.06
    "rbrace"          0.06
    Act Density 0.002%

    No Known Activations