INDEX
Explanations
emotionally charged language
profanity and exclamations
Profanity
Negative Logits
GenerationType    -0.79
OGND              -0.71
ModelExpression   -0.69
+#+#              -0.67
ècie              -0.66
下载附件           -0.66
ніципалі          -0.64
цездатний         -0.64
חיצוניים          -0.63
GEBURTSDATUM      -0.63
Positive Logits
tf        0.71
ass       0.62
god       0.56
god       0.53
TF        0.52
diab      0.52
tf        0.51
ass       0.51
Fucking   0.46
asf       0.46
Activations Density 0.892%