INDEX
Explanations
specific language that expresses condemnation or hostility towards groups of people, particularly reflecting themes of racism and discrimination
derogatory insults
Negative Logits
Token          Logit
الرياضيه       -0.53
genial         -0.47
iastes         -0.46
+#+#           -0.44
OGND           -0.44
UIFont         -0.42
sidemargin     -0.42
ArrowToggle    -0.42
involunt       -0.41
AsUp           -0.41
Positive Logits
Token          Logit
stupid         0.67
smelly         0.60
pesky          0.59
stupid         0.58
foreigners     0.56
foreigner      0.56
inferior       0.55
barbaric       0.55
useless        0.54
extranjera     0.54
Activations Density 0.118%