INDEX
Explanations
situations involving struggle or effort to reclaim something lost
Head Attr Weights
0:0.01
1:0.01
2:0.06
3:0.05
4:0.08
5:0.02
6:0.04
7:0.46
8:0.04
9:0.03
10:0.07
11:0.07
Negative Logits
Caf: -1.71
cius: -1.64
furt: -1.52
olis: -1.50
Moder: -1.45
GGGGGGGG: -1.45
abus: -1.45
Conversation: -1.40
pport: -1.40
speak: -1.39
Positive Logits
revenge: 1.83
claw: 1.75
scalp: 1.72
likeness: 1.58
redress: 1.57
paws: 1.53
limb: 1.48
fingert: 1.48
foothold: 1.48
healed: 1.47
Activations Density: 0.001%