INDEX
Explanations
terminology and references related to civil rights and discrimination laws
discrimination against protected groups
Negative Logits
varargin      -0.32
CONCER        -0.31
Permission    -0.31
way           -0.29
Noten         -0.28
Größe         -0.28
Szab          -0.27
님            -0.27
Rö            -0.27
uña           -0.27
Positive Logits
Prejudice         0.64
homophobic        0.63
Discrimination    0.60
discrimination    0.59
odio              0.59
judice            0.58
OCCURRED          0.57
hate              0.54
enschappelijke    0.54
discrimination    0.54
Activations Density 0.271%