Explanations
terms related to persecution and discrimination
terms related to the concept of persecution and its effects on marginalized groups
Negative Logits
eton       -0.76
prints     -0.72
gain       -0.71
balance    -0.71
rouse      -0.70
erm        -0.70
suggest    -0.69
clus       -0.69
jac        -0.68
arger      -0.67
Positive Logits
persecution    1.38
persecuted     1.15
persecut       1.14
Palest         0.83
oppression     0.80
ãĥķãĤ¡         0.78
retaliation    0.77
disadvant      0.74
bigotry        0.74
targeting      0.73
Activations Density 0.008%