INDEX
Explanations
terms and phrases related to foreign influence and military intervention
Head Attr Weights
0:0.01
1:0.01
2:0.06
3:0.04
4:0.04
5:0.03
6:0.43
7:0.06
8:0.02
9:0.04
10:0.14
11:0.06
Negative Logits
esville            -1.47
nen                -1.34
natureconservancy  -1.33
Shift              -1.27
ribly              -1.26
ongyang            -1.25
oola               -1.23
INAL               -1.22
�                  -1.21
gears              -1.21
Positive Logits
ucl     1.31
utions  1.18
ways    1.18
ali     1.18
abroad  1.18
arters  1.17
bia     1.16
Nato    1.14
mater   1.13
aid     1.13
Activations Density 0.014%