Explanations
themes of social justice and activism
Negative Logits
Tel        -0.15
ange       -0.14
ature      -0.14
425        -0.14
prejudice  -0.14
نز         -0.13
iso        -0.13
caps       -0.13
fid        -0.13
nom        -0.13
Positive Logits
cia        0.15
-action    0.15
вроп       0.14
action     0.14
dda        0.14
Pitch      0.14
ovací      0.14
TZ         0.14
ourselves  0.13
reff       0.13
Activations Density: 0.308%