Explanations
attends to the token "printk" from structural tokens
Head Attr Weights
0:0.01
1:0.02
2:0.07
3:0.60
4:0.05
5:0.07
6:0.07
7:0.07
Negative Logits
myſelf    -1.29
pleaſure  -1.22
itſelf    -1.21
Monfieur  -1.19
ſelf      -1.18
Majefty   -1.17
―――――     -1.14
purpoſe   -1.12
Jefus     -1.12
ſtate     -1.11
Positive Logits
0.64
A  0.56
I  0.53
E  0.52
B  0.52
T  0.51
,  0.51
N  0.51
F  0.51
0.51
Activations Density 0.025%