Explanations
first-person pronouns and related verbs, indicating personal statements and actions
Head Attr Weights
0:0.02
1:0.01
2:0.23
3:0.11
4:0.22
5:0.04
6:0.09
7:0.04
8:0.03
9:0.06
10:0.05
11:0.05
Negative Logits
untled      -1.50
ACTIONS     -1.44
acting      -1.38
assemb      -1.31
ardless     -1.31
raf         -1.28
unarmed     -1.27
posing      -1.26
odge        -1.24
itting      -1.23
Positive Logits
iest        1.33
Scrib       1.28
�           1.26
cents       1.26
tru         1.26
!.          1.24
!",         1.24
Cele        1.23
"},         1.22
favourites  1.21
Activations Density: 0.005%