INDEX
Explanations
imperative verbs indicating action
Head Attr Weights
0:0.08
1:0.08
2:0.09
3:0.08
4:0.08
5:0.07
6:0.08
7:0.09
8:0.07
9:0.08
10:0.07
11:0.07
Negative Logits
Solution:-2.21
Strategy:-2.11
Dynamic:-2.07
VERSION:-2.07
Phase:-1.94
fracturing:-1.92
vision:-1.92
fractured:-1.86
Vision:-1.78
Alternative:-1.77
Positive Logits
bombard:2.02
ktop:1.96
ceremon:1.95
pard:1.95
oun:1.94
aunder:1.91
sacrific:1.90
butterflies:1.83
theless:1.80
士:1.80
Activations Density: 0.000%