Explanations
phrases related to victims of various forms of harm or injustice
mentions of victims in various contexts
Negative Logits
Token         Logit
leaf          -0.71
ATIONAL       -0.69
66666666      -0.66
GGGG          -0.66
monarchy      -0.65
Plat          -0.65
CLASSIFIED    -0.64
obar          -0.63
Sau           -0.62
arp           -0.61
Positive Logits
Token         Logit
Victims       1.17
victims       1.13
Survivors     0.94
Victim        0.94
victim        0.85
survivors     0.84
inflicted     0.83
victimized    0.83
vict          0.79
Survivor      0.76
Activations Density: 0.013%