INDEX
    Explanations

    expressions of excitement and encouragement

    Negative Logits
    Token          Logit
    " :↵↵"         -0.21
    " :↵"          -0.19
    " ;↵↵"         -0.15
    ":↵↵"          -0.15
    " .↵↵"         -0.14
    "ırak"         -0.14
    "ãĥ³ãĥIJ"       -0.14
    " ;↵"          -0.13
    " jas"         -0.13
    ":↵"           -0.13

    Positive Logits
    Token          Logit
    " indeed"      0.20
    " glad"        0.18
    "inde"         0.17
    " agree"       0.17
    " Glad"        0.17
    " definitely"  0.16
    ",Yes"         0.16
    " Indeed"      0.15
    "Indeed"       0.15
    " yes"         0.15

    Act Density 0.128%

    No Known Activations