INDEX
Explanations
expressions related to emotional struggles and acceptance of loss
Head Attr Weights
0:0.09
1:0.04
2:0.37
3:0.07
4:0.04
5:0.07
6:0.05
7:0.06
8:0.03
9:0.08
10:0.04
11:0.02
Negative Logits
quart: -2.65
sty: -2.55
PRES: -2.43
�: -2.37
espresso: -2.33
inaug: -2.31
Stretch: -2.31
spec: -2.28
peppers: -2.28
undercover: -2.27
Positive Logits
losses: 6.99
Loss: 6.74
loss: 6.61
loss: 6.21
lost: 5.75
losing: 5.62
lost: 5.21
lose: 5.13
losers: 5.09
Lose: 4.91
Activations Density 0.134%