    Negative Logits
     Gabri         -0.07
    Drug           -0.06
    .eye           -0.06
     Invalidate    -0.06
    *I             -0.06
     Ray           -0.06
    _multip        -0.06
    	done          -0.06
                   -0.06
    >y             -0.06
    Positive Logits
    elder          0.08
    onz            0.07
    lassen         0.07
    。此            0.07
     ascend        0.07
     ascent        0.07
     descended     0.07
    insic          0.07
    ubs            0.07
    ез             0.07
    Activation Density: 0.012%

    No Known Activations