INDEX
    Explanations

    discussions around societal issues, particularly related to accusations and how they impact individuals

    Negative Logits
     poffe        -1.04
     purpoſe      -1.04
     pleaſure     -1.03
     deſt         -0.97
     houſe        -0.96
     fubject      -0.95
     ſever        -0.94
     tranſ        -0.93
     neceff       -0.93
     himſelf      -0.92
    Positive Logits
     whatnot      0.79
     stuff        0.75
     things       0.69
     maybe        0.68
     thingy       0.66
    ,             0.64
     et           0.60
     doings       0.60
     Maybe        0.59
    maybe         0.58
    Act Density 0.407%

    No Known Activations