INDEX
    Explanations

    instances of the phrase "don't be."

    Negative Logits
    Token      Logit
    "aybe"     -0.16
    "bage"     -0.15
    "ghan"     -0.14
    "ÑĢава"    -0.14
    "lest"     -0.14
    "abei"     -0.14
    "ADI"      -0.14
    "andi"     -0.14
    "isses"    -0.14
    "stoup"    -0.14
    Positive Logits
    Token        Logit
    " worry"     0.18
    "/do"        0.17
    "Go"         0.16
    " Go"        0.16
    "oice"       0.16
    " forget"    0.16
    " Allen"     0.16
    "Allen"      0.15
    " Forget"    0.15
    " go"        0.15
    Activation Density: 0.033%

    No Known Activations