INDEX
    Explanations

    phrases related to controversial or divisive topics and imagery

    elements depicting same-sex relationships or LGBTQ+ themes

    Negative Logits
    " Pwr"             -0.81
    " Emails"          -0.76
    "Initial"          -0.74
    "WAYS"             -0.73
    " Applications"    -0.72
    "asons"            -0.72
    "bors"             -0.71
    "rences"           -0.71
    "imester"          -0.70
    "effective"        -0.70
    Positive Logits
    " nude"            1.34
    " grinning"        1.31
    " smiling"         1.30
    " naked"           1.27
    " silhou"          1.27
    " decap"           1.26
    " bearded"         1.24
    " silhouette"      1.20
    " clothed"         1.17
    " caricature"      1.16
    Activation Density: 0.461%

    No Known Activations