INDEX
Explanations
characteristics related to demographics such as age, race, sex, and income
demographic factors such as age, sex, race, and ethnicity
Negative Logits
olars    -0.80
cour     -0.80
cit      -0.77
inka     -0.74
mud      -0.74
ossus    -0.74
etsk     -0.72
roo      -0.72
BLIC     -0.70
olin     -0.70
Positive Logits
specific       0.93
affiliation    0.92
nationality    0.85
preferences    0.84
differences    0.84
ethnicity      0.83
severity       0.81
angle          0.80
quantity       0.80
specifics      0.79
Activations Density: 0.512%