Explanations
words related to obscuring or concealing something
terms related to obscenity and obscured elements in the text
Negative Logits
Token    Logit
GD       -0.73
DF       -0.71
DS       -0.66
WARD     -0.66
Clover   -0.64
DIT      -0.64
Berk     -0.64
EMOTE    -0.64
DR       -0.64
CTV      -0.63
Positive Logits
Token     Logit
ĸļ        1.19
Û         1.16
obsc      0.97
ured      0.91
urities   0.87
eki       0.87
uring     0.84
ures      0.82
uration   0.81
offend    0.78
Activations Density 0.011%