INDEX
Explanations
mentions of individuals and their associated actions or feelings
Head Attr Weights
0:0.02
1:0.03
2:0.05
3:0.14
4:0.05
5:0.09
6:0.07
7:0.05
8:0.14
9:0.10
10:0.10
11:0.11
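
As a rough reading aid for the weights above, the sketch below ranks the heads by weight and sums them. It assumes each `index:value` pair is a per-attention-head attribution share, which the page itself does not define. The names `head_attr` and `top_heads` are hypothetical and chosen only for illustration.

```python
# Head attribution weights copied from the list above (head index -> weight).
# Assumption: each value is that head's share of the attribution.
head_attr = {
    0: 0.02, 1: 0.03, 2: 0.05, 3: 0.14, 4: 0.05, 5: 0.09,
    6: 0.07, 7: 0.05, 8: 0.14, 9: 0.10, 10: 0.10, 11: 0.11,
}


def top_heads(weights, k=3):
    """Return the k (head, weight) pairs with the largest weight.

    Ties are broken by the lower head index.
    """
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


total = sum(head_attr.values())

print(top_heads(head_attr))  # heads 3 and 8 tie for the largest weight
print(round(total, 2))       # the listed values sum to about 0.95, not 1.00
```

The total falls short of 1.00, which may simply reflect rounding of the displayed values to two decimals.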
Negative Logits
wid: -1.35
bringer: -1.33
?????-: -1.33
��: -1.27
��: -1.23
Mortal: -1.23
裏覚醒: -1.20
�: -1.20
stranger: -1.18
izoph: -1.17
Positive Logits
graduated: 1.42
defer: 1.35
earned: 1.30
subscribed: 1.30
retiring: 1.26
repaid: 1.25
apologised: 1.25
ilater: 1.23
careers: 1.23
intimid: 1.21
Activations Density 0.040%