Explanations
References to race, specifically the terms "whites" and "blacks."
Discussions of whites and blacks.
Negative Logits
Token    Logit
Mik      -0.54
[        -0.52
Py       -0.52
,        -0.48
-        -0.47
Pin      -0.47
Pe       -0.47
Kam      -0.46
nutri    -0.45
Van      -0.45
Positive Logits
Token              Logit
Whites             1.42
Whites             1.38
whites             1.36
whites             1.13
Blacks             0.91
blacks             0.91
highs              0.76
reds               0.72
ambién             0.66
desmotivaciones    0.66
Activations Density: 0.014%