INDEX
Explanations
phrases expressing the emotional state and evaluation of experiences or situations
Head Attr Weights
0:0.02
1:0.01
2:0.47
3:0.11
4:0.08
5:0.03
6:0.05
7:0.04
8:0.03
9:0.02
10:0.05
11:0.04
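
A minimal sketch for reading the head attribution weights above, assuming each `head:weight` entry is the attribution score of one attention head in a 12-head layer. The source does not say how the weights are normalized, so the sketch only ranks the heads and reports the sum of the values as listed. The names `head_attr`, `ranked`, and `total` are illustrative and not part of the dashboard.

```python
# Head attribution weights copied from the listing above (head index -> weight).
# Assumption: one entry per attention head; normalization is not stated in the source.
head_attr = {
    0: 0.02, 1: 0.01, 2: 0.47, 3: 0.11, 4: 0.08, 5: 0.03,
    6: 0.05, 7: 0.04, 8: 0.03, 9: 0.02, 10: 0.05, 11: 0.04,
}

# Rank heads from largest to smallest weight.
ranked = sorted(head_attr.items(), key=lambda kv: kv[1], reverse=True)
top_head, top_weight = ranked[0]

# Sum of the weights as listed (rounded values, so it need not equal 1).
total = sum(head_attr.values())

print(f"dominant head: {top_head} (weight {top_weight:.2f})")
print(f"sum of listed weights: {total:.2f}")
print(f"dominant head's share of the listed total: {top_weight / total:.1%}")
```

Ranking the listed values this way makes it easy to see which head carries most of the attribution and how quickly the remaining weights fall off.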
Negative Logits
vati          -1.63
shelters      -1.60
itaire        -1.48
ablo          -1.43
fasting       -1.41
tirelessly    -1.41
tty           -1.33
vulner        -1.32
Barron        -1.31
sacrific      -1.29
Positive Logits
signifies      1.71
outwe          1.63
nutshell       1.53
tones          1.51
illustrates    1.49
Pearce         1.48
mean           1.47
IRD            1.38
◼              1.37
indicates      1.35
Activations Density 0.118%