INDEX
Explanations
mentions of unethical or illegal activities and conditions
Head Attr Weights
0:0.05
1:0.07
2:0.02
3:0.08
4:0.03
5:0.08
6:0.10
7:0.35
8:0.06
9:0.05
10:0.02
11:0.04
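The weights above can be read programmatically. The following is a minimal sketch, assuming each `head:weight` line gives one attention head's attribution weight for this feature; the names `raw`, `parse_head_weights`, `top_head`, and `total` are illustrative and not part of the dashboard.

```python
# Head attribution weights copied from the listing above, one "head:weight" per line.
raw = """0:0.05
1:0.07
2:0.02
3:0.08
4:0.03
5:0.08
6:0.10
7:0.35
8:0.06
9:0.05
10:0.02
11:0.04"""


def parse_head_weights(text):
    """Parse 'head:weight' lines into a {head_index: weight} dict."""
    weights = {}
    for line in text.splitlines():
        head, value = line.split(":")
        weights[int(head)] = float(value)
    return weights


weights = parse_head_weights(raw)

# Head with the largest attribution weight (hypothetical helper variables).
top_head = max(weights, key=weights.get)

# Sum of the listed weights, rounded to the two decimals the listing uses.
total = round(sum(weights.values()), 2)

print(top_head, weights[top_head], total)
```

On the values listed, head 7 carries the largest weight (0.35), and the twelve listed weights sum to 0.95 rather than 1, which may reflect rounding in the display.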
Negative Logits
ascript: -2.81
itely: -2.16
epad: -2.16
Advis: -2.14
apter: -2.06
Panasonic: -2.02
batch: -2.00
illac: -1.98
DK: -1.96
olas: -1.96
Positive Logits
corruption: 3.20
suicides: 3.11
poisoning: 2.99
earthquakes: 2.99
malf: 2.95
thefts: 2.95
crime: 2.94
accidents: 2.93
murder: 2.88
homicides: 2.88
Activations Density: 0.521%