INDEX
    Explanations

    mentions of violent or controversial events

    references to the term "bath" in various contexts

    Negative Logits
    Ob
    -0.82
    IFA
    -0.72
    RY
    -0.70
    BT
    -0.70
    HI
    -0.70
     Democr
    -0.69
    APD
    -0.69
    MAN
    -0.69
    ICT
    -0.68
    KI
    -0.68
    Positive Logits
    …)
    0.84
    …
    0.78
    estyles
    0.75
    ゼウス
    0.70
    ...)
    0.68
     swall
    0.67
     FML
    0.66
    …"
    0.64
    estead
    0.63
     hers
    0.63
    Act Density 0.000%

    No Known Activations