INDEX
Explanations
pronouns and verbs indicating ongoing actions
Head Attr Weights
0:0.02
1:0.01
2:0.22
3:0.23
4:0.06
5:0.13
6:0.01
7:0.02
8:0.11
9:0.08
10:0.03
11:0.01
Negative Logits
י                   -1.34
י�                  -1.26
Recommend           -1.23
Rescue              -1.22
Message             -1.18
Ur                  -1.12
Language            -1.11
Ur                  -1.08
Wr                  -1.07
Gram                -1.06
Positive Logits
anwhile             1.56
yip                 1.40
nown                1.40
rily                1.34
phthal              1.29
ensibly             1.26
acknow              1.26
isSpecialOrderable  1.25
hyde                1.23
pretended           1.21
Activations Density 0.067%