INDEX
Explanations
references to deadlines and time-sensitive events
Head Attr Weights
0:0.02
1:0.01
2:0.09
3:0.06
4:0.07
5:0.06
6:0.04
7:0.08
8:0.04
9:0.04
10:0.30
11:0.14
Negative Logits
Prol          -1.38
trademarks    -1.36
THER          -1.34
Hezbollah     -1.25
videos        -1.23
Proud         -1.23
xen           -1.22
ocally        -1.18
Experiment    -1.18
guns          -1.16
Positive Logits
ticket        1.53
postpone      1.44
vu            1.40
snag          1.37
cutoff        1.37
breaker       1.37
hurdle        1.35
req           1.34
deadline      1.32
Timeout       1.32
Activations Density 0.007%