    Explanations

    indications of success in responses or messages

    Negative Logits
     “                -0.34
                      -0.33
     (                -0.32
    <h2>              -0.32
     "                -0.31
     the              -0.30
    Bats              -0.30
     about            -0.30
     [                -0.30
    <bos>             -0.30
    Positive Logits
    success           1.08
    SUCCESS           1.02
    Success           1.02
     SUCCESS          1.01
     Success          0.99
     success          0.95
    SuccessListener   0.94
    uccess            0.87
     sucess           0.87
     成功             0.87
    Act Density 0.013%

    No Known Activations