INDEX
    Explanations

    references to nudity and sexual themes

    Fires on explicit sexual content and on other highly charged, taboo, or sensational words: sexually explicit terms, strong insults or urges, and provocative descriptors.

    Negative Logits
    " ModelExpression"     -0.88
    " purpoſe"             -0.81
    " CreateTagHelper"     -0.81
    " myſelf"              -0.77
    " ſeveral"             -0.75
    " perſon"              -0.74
    "ImageContext"         -0.73
    " '\\;'"               -0.72
    " greateſt"            -0.72
    "rungsseite"           -0.70
    Positive Logits
    "googleapis"            0.54
    " theory"               0.54
    "theory"                0.48
    " teoría"               0.47
    " ("                    0.46
    "Theory"                0.46
    " sp"                   0.45
    " or"                   0.44
    "щика"                  0.44
    " az"                   0.44
    Act Density 1.120%

    No Known Activations