INDEX
Explanations
terms related to systemic issues or concerns
themes related to systemic issues and discrimination
Negative Logits
nance     -0.92
isher     -0.85
nery      -0.84
etsk      -0.83
ramer     -0.76
furt      -0.72
quet      -0.71
quer      -0.71
ishable   -0.71
ishers    -0.69
Positive Logits
ultural       0.76
systemic      0.76
sclerosis     0.76
helle         0.72
racism        0.70
societal      0.67
failures      0.67
offender      0.65
effects       0.65
proportions   0.65
Activations Density 0.031%