Explanations
key moments or highlights in a narrative
Head Attr Weights
0:0.01
1:0.02
2:0.11
3:0.05
4:0.10
5:0.02
6:0.04
7:0.40
8:0.02
9:0.03
10:0.09
11:0.07
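As a quick reading aid, the head attribution weights listed above can be summed and ranked. This is a minimal sketch that assumes each `index:weight` pair is a per-head attribution share. The listing does not say whether the weights are normalized, and the name `head_weights` is introduced here only for illustration.

```python
# Head attribution weights copied from the listing above (head index -> weight).
# Assumption: each value is that head's share of the attribution.
head_weights = {
    0: 0.01, 1: 0.02, 2: 0.11, 3: 0.05, 4: 0.10, 5: 0.02,
    6: 0.04, 7: 0.40, 8: 0.02, 9: 0.03, 10: 0.09, 11: 0.07,
}

# Sum of the listed weights (may differ from 1.0 because the values are rounded).
total = sum(head_weights.values())

# Head with the largest weight and its fraction of the listed total.
top_head = max(head_weights, key=head_weights.get)
top_share = head_weights[top_head] / total

# Heads sorted from largest to smallest weight.
ranked = sorted(head_weights.items(), key=lambda kv: kv[1], reverse=True)

print(f"total weight: {total:.2f}")
print(f"top head: {top_head} ({top_share:.1%} of listed total)")
print("top three heads:", ranked[:3])
```

With these values, head 7 carries by far the largest weight, followed by heads 2 and 4; the remaining heads each contribute 0.09 or less.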
Negative Logits
consent: -1.89
uyomi: -1.68
inately: -1.58
ascript: -1.55
conservancy: -1.54
ependent: -1.53
handle: -1.52
separ: -1.49
ensibly: -1.46
Consent: -1.46
Positive Logits
Congratulations: 1.54
shows: 1.52
hitting: 1.51
highlighting: 1.50
lifting: 1.47
showcasing: 1.47
Buzz: 1.45
highlights: 1.43
elight: 1.42
hitting: 1.42
Activations Density 0.002%