INDEX
    Explanations

    warning messages in text

    warnings or notifications regarding inappropriate or graphic content

    Negative Logits
    " inertia"      -0.79
    "--+"           -0.74
    " Recovery"     -0.73
    " staggered"    -0.72
    "buck"          -0.71
    " retire"       -0.69
    " waiting"      -0.69
    "town"          -0.68
    " swoop"        -0.68
    " patiently"    -0.66
    Positive Logits
    " pornographic"    1.63
    " nudity"          1.58
    " objectionable"   1.37
    " depictions"      1.33
    " depicting"       1.32
    " satire"          1.30
    " derogatory"      1.26
    " lewd"            1.25
    " misogyn"         1.24
    " blasp"           1.22
    Activation Density: 0.530%

    No Known Activations