Explanations
content related to protests and violence, particularly in relation to cartoons depicting religious figures
Negative Logits
ValueStyle        -0.82
protoimpl         -0.72
+#+#              -0.70
UnknownFieldSet   -0.67
phosa             -0.67
+:+               -0.66
onCreateView      -0.65
Wicidata          -0.65
DockStyle         -0.64
nakalista         -0.63
Positive Logits
mocking           0.67
derogatory        0.66
insults           0.64
disrespectful     0.62
insulting         0.60
hurtful           0.60
失礼              0.58
dispar            0.58
rude              0.57
degrading         0.57
Activations Density 0.276%