INDEX
    Explanations

    strong curse words

    strong profanity and expressions of frustration

    Negative Logits
    Token          Logit
    "Edit"         -0.86
    "BIL"          -0.86
    "inel"         -0.82
    "ãĤ¢ãĥ«"       -0.82
    "knit"         -0.81
    "irtual"       -0.81
    " behavi"      -0.75
    " Flavoring"   -0.75
    "opian"        -0.74
    "Msg"          -0.74

    Positive Logits
    Token          Logit
    " kidding"     0.97
    " hell"        0.89
    " bastard"     0.89
    " idiot"       0.87
    " retard"      0.84
    " stink"       0.81
    " asshole"     0.80
    " shit"        0.80
    " thing"       0.79
    " idiots"      0.79
    Act Density 0.054%

    No Known Activations
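
The negative-logit token ãĤ¢ãĥ« looks like a raw byte-level BPE vocabulary string rather than readable text. The sketch below assumes the model uses a GPT-2-style byte-to-unicode table (the page does not state which tokenizer is in use) and shows how such a string could be mapped back to UTF-8. The function names are hypothetical and chosen only for illustration.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style byte -> printable-character table (assumed tokenizer)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


def decode_byte_level_token(token: str) -> str:
    """Map a raw vocabulary string back to text (hypothetical helper)."""
    char_to_byte = {c: b for b, c in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[c] for c in token)
    # Replace invalid sequences instead of raising, since a single token
    # may hold only part of a multi-byte character.
    return raw.decode("utf-8", errors="replace")


print(decode_byte_level_token("ãĤ¢ãĥ«"))
```

Under that assumption the token decodes to the katakana pair アル. The helper only accepts raw vocabulary strings: tokens already shown in decoded form, such as those listed with a literal leading space, would raise a KeyError.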