    Negative Logits
    Token          Logit
    "\tpre"        -0.07
    "-make"        -0.07
    "wide"         -0.07
    " raises"      -0.06
    " choosing"    -0.06
    "はず"         -0.06
    " Ji"          -0.06
    "\tre"         -0.06
    "boro"         -0.06
    " потому"      -0.06
    Positive Logits
    Token          Logit
    " minimal"     0.12
    " Minimal"     0.08
    "يلا"          0.07
    "Graphics"     0.07
    " Pixels"      0.07
    " nipple"      0.07
    "Minimal"      0.06
    " minimum"     0.06
    ".slim"        0.06
    "(Token"       0.06
    Activation Density: 0.006%

    No Known Activations