INDEX
Explanations
array or list-like structures in code
Negative Logits
Token    Logit
<eos>    -0.75
I        -0.64
)        -0.63
(blank)  -0.62
'        -0.62
>(</     -0.62
O        -0.62
\        -0.61
]        -0.61
‘        -0.61
Positive Logits
Token    Logit
["       2.08
(["      2.02
['       2.00
(['      1.86
=["      1.85
=['      1.84
["       1.67
:['      1.54
',['     1.46
[['      1.40
Activations Density 0.277%