INDEX
Explanations
references to political affiliations and members of Congress
Head Attr Weights
0:0.11
1:0.01
2:0.06
3:0.34
4:0.02
5:0.11
6:0.04
7:0.03
8:0.04
9:0.01
10:0.14
11:0.04
Negative Logits
Comput      -2.31
estyles     -2.27
Designs     -2.21
shirts      -2.13
Newsp       -2.12
ournals     -2.11
Zeit        -2.10
Machines    -2.10
Journals    -2.09
IDs         -2.06
Positive Logits
thorn        2.37
ally         2.34
foe          2.32
tough        2.24
defender     2.19
prohib       2.17
contentious  2.17
prone        2.16
embattled    2.15
vulnerable   2.14
Activations Density 0.055%