INDEX
    Explanations
    NEGATIVE LOGITS
    "ITER"       -0.07
    "ير"         -0.07
    " 클래스"    -0.06
    "иск"        -0.06
    " Cond"      -0.06
    " dividing"  -0.06
    "-prop"      -0.06
    " Crafts"    -0.06
    "學校"       -0.06
    "Contain"    -0.06
    POSITIVE LOGITS
    "ancements"      0.07
    " самостоятель"  0.07
    "enuity"         0.07
    "Safety"         0.07
    "ARIANT"         0.07
    "\tImGui"        0.07
    " DD"            0.06
    " clearing"      0.06
    " MB"            0.06
    " CGI"           0.06
    Activation Density: 0.091%

    No Known Activations