INDEX
Explanations
mentions of key figures and entities related to human rights violations
Head Attr Weights
0:0.07
1:0.02
2:0.28
3:0.07
4:0.17
5:0.05
6:0.02
7:0.03
8:0.08
9:0.09
10:0.05
11:0.02
Negative Logits
SPONSORED  -1.68
eleph      -1.36
OIL        -1.24
VW         -1.23
convol     -1.23
citiz      -1.22
etheless   -1.21
esse       -1.15
sylv       -1.14
RIC        -1.11
Positive Logits
geist      1.63
opol       1.33
anski      1.24
idon       1.20
axis       1.19
stals      1.19
Reloaded   1.18
endor      1.17
zman       1.16
henko      1.16
Activations Density: 0.006%