INDEX
Explanations
names of individuals or organizations mentioned in the text
names of political or governmental leaders and their affiliations
Head Attr Weights
0:0.08
1:0.02
2:0.29
3:0.07
4:0.15
5:0.04
6:0.04
7:0.04
8:0.06
9:0.07
10:0.06
11:0.03
Negative Logits
cause: -1.24
Circus: -1.10
atform: -1.01
berus: -1.00
cms: -0.99
��: -0.98
Rove: -0.98
caf: -0.98
linkage: -0.96
ITCH: -0.95
Positive Logits
kaya: 1.42
uddin: 1.31
idis: 1.30
iman: 1.21
irts: 1.10
idge: 1.09
ensen: 1.08
û: 1.08
Exploration: 1.08
eed: 1.07
Activations Density 0.008%