Explanations
expressions of patience and requests for understanding from others
Head Attribution Weights
0:0.03
1:0.02
2:0.06
3:0.06
4:0.08
5:0.03
6:0.04
7:0.41
8:0.05
9:0.03
10:0.08
11:0.06
Negative Logits
ccording          -1.80
sophistication    -1.64
ificantly         -1.64
CLUS              -1.57
ONSORED           -1.56
pestic            -1.56
uliffe            -1.54
nesota            -1.48
restrial          -1.48
ceivable          -1.45
Positive Logits
patiently         2.12
lest              1.70
wait              1.65
till              1.62
waited            1.52
til               1.47
clamp             1.47
drum              1.45
compr             1.44
laus              1.44
Activations Density: 0.002%