Explanations
references to racism and its social implications
Negative Logits
Christmas   -0.15
徳          -0.15
christmas   -0.15
otre        -0.14
osti        -0.14
bakan       -0.14
ushort      -0.14
implify     -0.14
ilde        -0.14
rove        -0.14
Positive Logits
police      0.19
Floyd       0.18
racial      0.18
racism      0.18
Black       0.17
Police      0.17
race        0.16
unity       0.16
policing    0.15
white       0.15
Activations Density: 0.079%