INDEX
    Explanations

    periods at the end of sentences

    Negative Logits
    Token               Logit
    " personality"      -0.74
    " unex"             -0.66
    " disemb"           -0.66
    "unts"              -0.65
    " pocket"           -0.65
    " clos"             -0.65
    " pillar"           -0.63
    " involuntary"      -0.63
    " tyr"              -0.63
    " utter"            -0.62

    Positive Logits
    Token               Logit
    " Unfortunately"    1.40
    " Ideally"          1.36
    " Instead"          1.17
    " Otherwise"        1.13
    " Specifically"     1.13
    " Fortunately"      1.11
    " Luckily"          1.09
    " Sadly"            1.08
    " Doing"            1.06
    " But"              1.04

    Activation Density 0.523%

    No Known Activations