    Explanations

    instances of conversational markers and expressions of enthusiasm

    Negative Logits
    Token            Logit
    "Alongside"      -0.80
    " :')"           -0.79
    " Alongside"     -0.77
    "AndEndTag"      -0.75
    "klart"          -0.72
    " ;-;"           -0.71
    " ͡°"            -0.69
    " subreddit"     -0.69
    " screenshot"    -0.68
    " tryna"         -0.68

    Positive Logits
    Token            Logit
    " BTW"           0.89
    "BTW"            0.88
    " muß"           0.87
    "IMHO"           0.76
    " müßte"         0.73
    "OK"             0.72
    " läßt"          0.72
    "Надо"           0.72
    " mußte"         0.71
    "Thanx"          0.69

    Activation Density: 0.469%

    No Known Activations