INDEX
Explanations
references to notable figures and concepts related to media and politics
Head Attr Weights
0:0.05
1:0.02
2:0.04
3:0.23
4:0.02
5:0.07
6:0.02
7:0.06
8:0.02
9:0.02
10:0.38
11:0.02
Negative Logits
progresses  -2.25
enhances    -2.16
exceeds     -2.06
stimulates  -2.04
reduces     -2.04
improves    -2.01
activates   -2.01
removes     -2.00
iversary    -1.98
izont       -1.98
Positive Logits
whose      2.47
whom       2.36
who        2.19
.):        2.11
whose      2.03
who        2.02
ICH        1.98
complicit  1.87
$.         1.82
.]         1.81
Activations Density: 0.824%