INDEX
    Explanations

    conversational exchanges and dialogue structure in the text

    Negative Logits
    " tbh"          -0.79
    " tasked"       -0.67
    " ngl"          -0.67
    " impactful"    -0.64
    " multiple"     -0.64
    " Idk"          -0.63
    " idk"          -0.63
    " bestie"       -0.63
    "Thankfully"    -0.63
    " Notably"      -0.62
    Positive Logits
    " muß"          0.79
    "faßt"          0.77
    " everybody"    0.70
    " lousy"        0.68
    "everybody"     0.65
    " Daß"          0.65
    " somebody"     0.65
    " müßte"        0.62
    " daß"          0.61
    " Everybody"    0.59
    Activation Density: 1.030%

    No Known Activations