INDEX
Explanations
references to specific political organizations or positions within government
Head Attr Weights
0:0.03
1:0.04
2:0.09
3:0.12
4:0.03
5:0.03
6:0.20
7:0.13
8:0.06
9:0.05
10:0.08
11:0.10
Negative Logits
Monroe        -1.15
@@            -1.14
vertisements  -1.14
ERSON         -1.13
ircraft       -1.13
ionage        -1.10
idious        -1.10
LOG           -1.09
PDATE         -1.07
rican         -1.05
Positive Logits
utra          1.32
ews           1.08
holders       1.06
gravy         1.05
Provincial    1.03
の�           1.01
constituency  0.99
jriwal        0.98
outh          0.96
iser          0.95
Activations Density 0.001%