INDEX
Explanations
phrases related to victims of various types of harm or oppression
references to victims of various crimes and traumatic events
Negative Logits
ulum      -0.77
arily     -0.71
ahead     -0.71
hua       -0.68
heads     -0.68
prints    -0.67
ortment   -0.67
shire     -0.67
agy       -0.67
liness    -0.67
Positive Logits
persecution      0.83
injustice        0.83
discrimination   0.80
harassment       0.80
oppression       0.78
violence         0.78
abuse            0.77
gunfire          0.74
Hurricane        0.73
racism           0.72
Activations Density 0.085%