INDEX
Explanations
phrases related to social justice and human rights advocacy
references to individuals or groups who are marginalized or discriminated against
Negative Logits
ob          -0.79
OB          -0.68
opoly       -0.67
onis        -0.67
Serpent     -0.64
bang        -0.64
PB          -0.62
Hang        -0.62
ointment    -0.61
kamp        -0.59
Positive Logits
wishing     1.01
who         0.98
pesky       0.88
who         0.82
entrusted   0.80
kinds       0.80
favoring    0.80
affected    0.80
interested  0.79
interviewed 0.79
Activations Density: 0.069%