Explanations
words related to social justice, equity, and discrimination, especially focusing on the experiences of marginalized communities
references to marginalized communities, particularly focusing on issues affecting Black and transgender individuals
Negative Logits
HL                  -0.79
yss                 -0.75
USS                 -0.71
UMP                 -0.69
staff               -0.69
LOCK                -0.65
inventoryQuantity   -0.64
åĬ                  -0.61
uner                -0.61
PASS                -0.61
Positive Logits
who                 0.88
folk                0.79
arettes             0.76
paces               0.75
hood                0.73
everywhere          0.71
ervatives           0.70
whom                0.69
marry               0.69
olics               0.67
Activations Density 0.104%