    Explanations

    dialogue that involves advice and reflection on personal growth or accountability

    Negative Logits
    fuck            -0.19
     fucked         -0.18
     fuck           -0.17
     FUCK           -0.17
     fucks          -0.15
     Fucking        -0.15
     Fuck           -0.15
     fucking        -0.15
     rapes          -0.14
    Fuck            -0.14
    Positive Logits
     buddy          0.23
     partner        0.20
     buddies        0.20
     fellow         0.18
     boss           0.17
     brother        0.17
     amigo          0.17
     accomp         0.17
     intimidating   0.17
     mentor         0.17
    Activation Density: 0.158%

    No Known Activations