    Negative Logits
    Token (quoted)    Logit
    " victim"         -0.08
    " efficiently"    -0.07
    " арх"            -0.07
    " tremendous"     -0.07
    " competency"     -0.07
    " рек"            -0.07
    "_EDGE"           -0.07
    " effectively"    -0.07
    " decking"        -0.06
    "_DOMAIN"         -0.06
    Positive Logits
    Token (quoted)    Logit
    " pause"          0.16
    " pauses"         0.13
    "Pause"           0.12
    "pause"           0.12
    " Pause"          0.12
    " paused"         0.11
    "_pause"          0.09
    ".pause"          0.08
    "_PAUSE"          0.08
    "ause"            0.07
    Activation Density: 0.004%

    No Known Activations