INDEX
    NEGATIVE LOGITS
    "asion"        -0.07
    "aned"         -0.07
    (blank)        -0.07
    " Him"         -0.07
    " defend"      -0.06
    " analyse"     -0.06
    "وبة"          -0.06
    ".downcase"    -0.06
    " compete"     -0.06
    " crappy"      -0.06
    POSITIVE LOGITS
    "mkdir"        0.10
    " mkdir"       0.09
    "kdir"         0.08
    "\tmkdir"      0.07
    "uclear"       0.06
    " Jad"         0.06
    "schemas"      0.06
    " resigned"    0.06
    "PCI"          0.06
    " ма"          0.06
    Activation Density: 0.001%

    No Known Activations