INDEX
Explanations
topics related to marginalized communities and their struggles
Negative Logits
estro     -0.17
loh       -0.16
arez      -0.15
433       -0.15
ezi       -0.14
rlen      -0.14
373       -0.14
basket    -0.14
aze       -0.13
mdir      -0.13
Positive Logits
groups         0.25
populations    0.23
-groups        0.19
sectors        0.18
Groups         0.17
segments       0.17
classes        0.17
special        0.17
group          0.17
protected      0.16
Activations Density: 0.086%