INDEX
Explanations
phrases related to controversial or divisive topics and imagery
elements depicting same-sex relationships or LGBTQ+ themes
Negative Logits
Pwr          -0.81
Emails       -0.76
Initial      -0.74
WAYS         -0.73
Applications -0.72
asons        -0.72
bors         -0.71
rences       -0.71
imester      -0.70
effective    -0.70
Positive Logits
nude         1.34
grinning     1.31
smiling      1.30
naked        1.27
silhou       1.27
decap        1.26
bearded      1.24
silhouette   1.20
clothed      1.17
caricature   1.16
Activations Density: 0.461%