INDEX
    Explanations
    Negative Logits
    Token           Logit
    " RDF"          -0.07
    " qb"           -0.06
    "_LAYOUT"       -0.06
    " Jeffrey"      -0.06
    "OTH"           -0.06
    " facult"       -0.06
    " summarized"   -0.06
    "rawer"         -0.06
    " бор"          -0.06
    "ilitary"       -0.06
    Positive Logits
    Token           Logit
    "False"         0.07
    "HCI"           0.07
    "\tfloat"       0.06
    " gcc"          0.06
    " don"          0.06
    " лит"          0.06
    ".imwrite"      0.06
    "turn"          0.06
    " بالا"         0.06
    " cool"         0.06
    Activation Density: 0.029%

    No Known Activations