    Negative Logits
     примен  -0.08
    	MD  -0.07
    LEEP  -0.07
     cloning  -0.07
    ldr  -0.07
    	DBG  -0.07
    	win  -0.07
     pollen  -0.06
     pleasing  -0.06
    90  -0.06
    Positive Logits
    AS  0.08
     như  0.08
     which  0.08
     as  0.07
     `\  0.07
     Sap  0.07
    as  0.07
    eter  0.07
    {\  0.07
     abolish  0.07
    Activation Density: 0.075%

    No Known Activations