Explanations
mentions of minority groups
references to minority groups and their issues
Negative Logits
Token        Logit
hran         -0.86
DCS          -0.86
lov          -0.85
atche        -0.82
sis          -0.80
æ©           -0.78
hiba         -0.77
ensional     -0.75
rir          -0.75
dra          -0.71
Positive Logits
Token        Logit
minority     0.93
minorities   0.83
genders      0.81
males        0.81
populations  0.78
quarters     0.77
ativity      0.76
groups       0.74
peoples      0.73
faiths       0.72
Activations Density: 0.014%