INDEX
Explanations
expressions of sadness and regret
Head Attribution Weights
0:0.03
1:0.02
2:0.29
3:0.12
4:0.10
5:0.03
6:0.04
7:0.12
8:0.04
9:0.03
10:0.06
11:0.07
Negative Logits
skills       -1.91
ograms       -1.85
likes        -1.79
ogram        -1.62
flair        -1.62
ヘ           -1.60
rities       -1.55
preference   -1.53
�            -1.50
IQ           -1.49
Positive Logits
ilst         1.81
Stra         1.66
abis         1.52
haus         1.52
SECTION      1.49
olid         1.48
dy           1.43
onel         1.43
theless      1.38
held         1.37
Activations Density 0.001%