Explanations
references to accuracy or correctness
Negative Logits
müſſen       -0.79
imagui       -0.76
<unused43>   -0.75
<unused23>   -0.75
<pad>        -0.75
<unused42>   -0.75
<unused51>   -0.75
<unused41>   -0.75
<unused17>   -0.75
<unused19>   -0.75
Positive Logits
<bos>        0.69
()))         0.65
'))          0.51
"))          0.50
prints       0.47
])),         0.46
())).        0.46
()])         0.45
])           0.45
and          0.45
Activations Density: 0.942%