Explanations
terms related to civil rights and legal protections
NEGATIVE LOGITS
FOUNDATION    -0.17
ÑĨен          -0.16
eneg          -0.15
cliffe        -0.15
judgment      -0.15
Ŀ             -0.15
\Php          -0.15
isd           -0.14
judgement     -0.14
.optimize     -0.13
POSITIVE LOGITS
Race          0.17
Race          0.17
race          0.15
ucas          0.15
Races         0.15
race          0.15
477           0.15
abled         0.14
Castro        0.14
abler         0.14
Activations Density 0.282%