INDEX
    Explanations

    expressions of gratitude and encouragement

    Negative Logits
    -0.18
     Fucking
    -0.17
     FUCK
    -0.17
     Fuck
    -0.16
     fucks
    -0.15
    Fuck
    -0.15
     fuck
    -0.15
     ·
    -0.15
     fucking
    -0.15
    :↵
    -0.14
    Positive Logits
     glad
    0.20
     indeed
    0.20
     Indeed
    0.18
    Indeed
    0.17
     appreciate
    0.17
     yes
    0.17
     agree
    0.17
     Glad
    0.17
    inde
    0.16
     apprec
    0.16
    Act Density 0.079%

    No Known Activations