INDEX
Explanations
discussions of systemic inequalities and injustices affecting marginalized communities
Negative Logits
rhestr    -0.41
suite     -0.41
nah       -0.38
suites    -0.38
rela      -0.34
request   -0.34
Pilots    -0.34
instant   -0.34
suit      -0.33
PIL       -0.33
Positive Logits
marginalized   0.63
vulnerables    0.62
vulnerable     0.62
featureID      0.57
ScopeManager   0.54
Vulnerable     0.54
vulné          0.52
vantaged       0.52
minority       0.51
minorities     0.50
Activations Density: 0.533%