Explanations
themes related to racial justice and community engagement
Negative Logits
Token    Logit
epend    -0.16
μί       -0.15
روز      -0.15
ayah     -0.15
เพล      -0.14
greso    -0.14
pps      -0.14
dik      -0.13
eba      -0.13
adero    -0.13
Positive Logits
Token       Logit
problem     0.28
interrog    0.27
gr          0.27
decent      0.27
question    0.25
foreground  0.24
trouble     0.24
privilege   0.23
challenge   0.23
privileges  0.22
Activations Density: 0.161%