INDEX
    Explanations

    email addresses in text

    mentions of email addresses or social media handles

    Negative Logits
     Catal      -0.64
     gambling   -0.60
     acqu       -0.58
     reward     -0.58
     torn       -0.57
     analges    -0.57
     Ninth      -0.56
     attempt    -0.56
     decl       -0.56
     Kidd       -0.55
    Positive Logits
    @           4.06
     @          2.04
     "@         1.84
    @#          1.66
     (@         1.49
    #           1.27
    ://         1.26
    =#          1.14
    @@          1.11
    email       1.04
    Activation Density: 0.013%

    No Known Activations