INDEX
Explanations
keywords related to discrimination based on various factors like race, religion, gender, and age
terms related to discrimination and its various forms
Negative Logits
Token    Logit
TOR      -0.74
ski      -0.72
Adds     -0.72
bold     -0.71
notes    -0.71
DCS      -0.69
links    -0.66
VIDEO    -0.66
cycle    -0.65
spin     -0.65
Positive Logits
Token             Logit
discrimination    0.94
rimination        0.92
prejudice         0.91
discriminating    0.82
yip               0.81
retaliation       0.81
prejud            0.80
discriminated     0.78
ially             0.77
Discrimination    0.76
Activation Density: 0.048%