Explanations
references to nudity and sexual themes
It fires on explicit sexual content and on other highly charged, taboo, or sensational words (sexually explicit terms, strong insults or urges, and provocative descriptors).
Negative Logits
ModelExpression     -0.88
purpoſe             -0.81
CreateTagHelper     -0.81
myſelf              -0.77
ſeveral             -0.75
perſon              -0.74
ImageContext        -0.73
'\\;'               -0.72
greateſt            -0.72
rungsseite          -0.70
Positive Logits
googleapis          0.54
theory              0.54
theory              0.48
teoría              0.47
(                   0.46
Theory              0.46
sp                  0.45
or                  0.44
щика                0.44
az                  0.44
Activations Density: 1.120%