Explanations
names of entities, especially those related to bills, series, or specific options within contexts
Head Attr Weights
0:0.04
1:0.13
2:0.03
3:0.03
4:0.03
5:0.42
6:0.03
7:0.02
8:0.05
9:0.08
10:0.06
11:0.03
Negative Logits
comed      -2.31
ask        -1.97
scl        -1.84
mp         -1.82
accompan   -1.78
racial     -1.74
hat        -1.73
odi        -1.71
serv       -1.71
gam        -1.70
Positive Logits
XVI        2.05
XII        1.97
XIV        1.90
VII        1.84
XI         1.75
twent      1.73
01         1.69
Eleven     1.67
XIII       1.65
XX         1.64
Activations Density 0.075%