Explanations
themes related to social justice and activism
Negative Logits
zim           -0.14
esi           -0.14
verifier      -0.14
ÏĨο           -0.14
etur          -0.14
rok           -0.13
_lazy         -0.13
.localized    -0.13
UILT          -0.13
qualifiers    -0.13
Positive Logits
justice       0.38
equality      0.33
fair          0.32
equal         0.31
rights        0.30
fairness      0.28
equity        0.27
Justice       0.25
ending        0.23
justice       0.23
Activations Density 0.287%