INDEX
Explanations
words related to institutional systems and policies
terms related to systemic or institutional issues and abuses
Negative Logits
nen       -0.86
uden      -0.76
cius      -0.76
spell     -0.73
llan      -0.72
vous      -0.72
bane      -0.71
hammad    -0.70
aday      -0.68
ertodd    -0.67
Positive Logits
ized         1.44
ised         1.26
ization      1.18
izational    1.15
izing        1.12
ize          1.08
izes         1.06
racism       1.04
ising        1.03
isation      1.02
Activations Density: 0.048%