INDEX
    Explanations
    Negative Logits
    كومة    -0.07
    itesse    -0.07
    opo    -0.07
     креп    -0.06
    061    -0.06
    legg    -0.06
     lục    -0.06
    has    -0.06
    _NOT    -0.06
    ы    -0.06
    Positive Logits
     proces    0.07
    <-    0.06
     rewritten    0.06
    $options    0.06
    	db    0.06
    .squeeze    0.06
    ='"    0.06
    .Assign    0.06
    .ts    0.06
     glued    0.06
    Activation Density: 0.015%

    No Known Activations