INDEX
Explanations
expressions of reluctance or unwillingness
Head Attr Weights
0:0.07
1:0.02
2:0.11
3:0.09
4:0.25
5:0.05
6:0.03
7:0.02
8:0.11
9:0.13
10:0.04
11:0.02
Negative Logits
ILCS: -1.48
minus: -1.37
jiang: -1.36
rav: -1.27
ulture: -1.25
Maps: -1.24
annot: -1.23
GD: -1.18
版: -1.16
Neuroscience: -1.16
Positive Logits
timid: 1.50
comprom: 1.35
dissu: 1.28
clinging: 1.25
pursuing: 1.22
acquies: 1.21
anything: 1.20
nor: 1.20
reluctant: 1.19
hesitant: 1.18
Activations Density 0.011%