INDEX
Explanations
topics related to social equity and support for marginalized communities
Negative Logits
olid    -0.19
èĤ²     -0.16
eko     -0.15
173     -0.15
ungal   -0.15
Äijứ    -0.15
aget    -0.15
OLID    -0.15
ante    -0.14
ADED    -0.14
Positive Logits
minorities    0.38
minority      0.38
vulnerable    0.32
marginalized  0.30
minor         0.29
Minor         0.28
marginal      0.28
Vulner        0.28
marg          0.28
Minor         0.28
Activations Density 0.237%
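
The density figure above is not defined on this page. A minimal sketch, assuming it means the fraction of sampled tokens on which the feature's activation is non-zero: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 tokens, 2 of which activate the feature.
sample = [0.0] * 998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.237% would mean the feature fires on roughly 2 to 3 tokens out of every 1000 sampled.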