    Negative Logits
    Token            Logit
    "012"            -0.07
    " Cold"          -0.07
    " cold"          -0.07
    " patterns"      -0.06
    "-derived"       -0.06
    " politique"     -0.06
    "-headed"        -0.06
    ".dynamic"       -0.06
    " Defensive"     -0.06
    " жит"           -0.06
    Positive Logits
    Token            Logit
    "_SEQUENCE"      0.07
    "ếp"             0.06
    "PFN"            0.06
    " exclus"        0.06
    "\tRETURN"       0.06
    " Took"          0.06
    "alloween"       0.06
    " Geç"           0.06
    " Marxist"       0.06
    "ames"           0.06
    Activation Density: 0.015%

    No Known Activations