INDEX
Explanations
references to nudity
references to nudity or nakedness
Negative Logits
riers       -0.83
soType      -0.81
Flavoring   -0.77
rador       -0.72
mental      -0.72
ppa         -0.72
SHIP        -0.71
ffee        -0.70
ãĥ£         -0.70
ãĥį         -0.69
Positive Logits
selfies       1.06
mole          0.97
selfie        0.94
photographs   0.92
nude          0.91
silhou        0.89
photos        0.88
pictures      0.84
breasts       0.83
bathing       0.82
Activations Density: 0.040%