Explanations
references to sexual situations and activities
Negative Logits
Token       Logit
sex         -0.16
sexual      -0.16
prostit     -0.15
adult       -0.15
vert        -0.14
disturbed   -0.14
нки         -0.14
adult       -0.14
sexuality   -0.14
adults      -0.14
Positive Logits
Token       Logit
cum         0.26
Cum         0.25
Cum         0.23
_cum        0.23
cum         0.21
Worship     0.20
worship     0.20
clim        0.19
worsh       0.19
fuck        0.18
Activations Density: 0.016%
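
The page does not say how the density figure is computed. A minimal sketch, assuming it is the percentage of sampled tokens on which the feature's activation is above zero (the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical):

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source page does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 16 of 100,000 tokens.
sample = [0.0] * 99_984 + [1.5] * 16
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.016% would correspond to the feature firing on roughly 16 tokens out of every 100,000 sampled.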