INDEX
Explanations
concepts related to racial discrimination and privilege
Negative Logits
æ¿          -0.17
esy         -0.15
anoi        -0.15
üy          -0.15
STYPE       -0.14
ÃĴ          -0.14
ائÙĬ        -0.13
;element    -0.13
898         -0.13
igin        -0.13
Positive Logits
systems     0.29
structural  0.28
patri       0.28
privilege   0.27
Patri       0.27
systemic    0.26
systems     0.25
Systems     0.25
Structural  0.25
structures  0.24
Activations Density: 0.352%