Explanations
profane and offensive language
explicit and vulgar language
Negative Logits
Completed       -0.75
INAL            -0.74
utterstock      -0.72
guiActiveUn     -0.71
arcity          -0.70
istar           -0.69
reluct          -0.69
PsyNetMessage   -0.68
CONCLUS         -0.66
OSP             -0.66
Positive Logits
hole            1.03
holes           0.90
shit            0.86
face            0.83
xual            0.82
fuck            0.81
mma             0.81
manship         0.79
pants           0.77
nuts            0.76
Activations Density: 0.051%