INDEX
    Explanations

    phrases indicative of guidance or advice in various contexts

    Negative Logits
     pleaſure           -0.96
     houſe              -0.90
     itſelf             -0.90
    ſelf                -0.89
     himſelf            -0.85
     poffible           -0.84
     Jefus              -0.84
    findpost            -0.84
    setVerticalGroup    -0.82
     purpoſe            -0.82
    Positive Logits
     fucked             0.68
    ...                 0.67
     ça                 0.66
                        0.66
     nice               0.66
     haha               0.65
     shitty             0.65
     stupidly           0.65
     fucking            0.63
     shit               0.62
    Act Density 0.030%

    No Known Activations