Negative Logits
Token         Logit
<bos>         -0.61
0             -0.60
don           -0.51
,             -0.50
↵↵            -0.50
k             -0.50
<eos>         -0.50
m             -0.49
ila           -0.49
+             -0.48
Positive Logits
Token         Logit
"])           0.79
AnchorStyles  0.78
"]            0.71
endfor        0.69
']]           0.68
']))          0.68
'])){         0.68
']);          0.67
"];           0.66
'])           0.66
Activations Density: 0.004%