INDEX
    NEGATIVE LOGITS
    Token            Logit
    "Мат"            -0.08
    "_fail"          -0.08
    " Wife"          -0.07
    "ariable"        -0.07
    "Apart"          -0.07
    "_EXCEPTION"     -0.07
    "\tfail"         -0.07
    " aerospace"     -0.07
    " heavens"       -0.07
    "-то"            -0.07
    POSITIVE LOGITS
    Token            Logit
    " betre"          0.09
                      0.08
    "izada"           0.08
    "ogat"            0.07
    " Palais"         0.07
    " massas"         0.07
    " press"          0.07
    " popping"        0.07
    " bouts"          0.07
    "cult"            0.07
    Activation Density: 0.001%

    No Known Activations