INDEX
Explanations
conditional phrases and indications of potential actions or outcomes
Head Attr Weights
0:0.05
1:0.01
2:0.27
3:0.07
4:0.09
5:0.04
6:0.09
7:0.05
8:0.08
9:0.05
10:0.08
11:0.07
Negative Logits
jon     -1.83
ICLE    -1.78
beware  -1.77
vt      -1.51
6666    -1.46
Moder   -1.45
Û       -1.44
י       -1.44
ci      -1.42
eday    -1.42
Positive Logits
bourg         1.63
lihood        1.54
rous          1.54
disturbance   1.49
yip           1.47
characterize  1.46
appropriate   1.44
◼             1.42
ollah         1.42
disturbances  1.41
Activations Density 0.002%