INDEX
Explanations
terms related to social justice issues and discrimination
Negative Logits
findpost      -0.72
ುವ            -0.62
XtraBars      -0.58
väg           -0.55
syn           -0.54
avy           -0.54
eterminate    -0.53
捌            -0.53
gy            -0.53
volantes      -0.53
Positive Logits
inequality        1.98
inequalities      1.78
discrimination    1.75
Inequality        1.71
racism            1.66
equality          1.59
Discrimination    1.58
Equality          1.48
Racism            1.43
discriminatory    1.42
Activations Density 0.123%