INDEX
Explanations
brackets
The neuron fires on the opening square bracket token (“[”).
Negative Logits
Token       Logit
Commons     -0.06
nozzle      -0.06
commanded   -0.06
�           -0.06
�p          -0.06
\:          -0.06
δει         -0.06
loosen      -0.06
tactical    -0.06
violent     -0.06
Positive Logits
Token         Logit
metadata      0.08
""↵           0.07
').↵          0.07
.shape        0.07
*/ ↵          0.07
uncertainty   0.07
adel          0.07
ertainty      0.07
>"↵           0.07
'↵            0.06
Activations Density 0.009%
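To make the density figure concrete, here is a minimal sketch that assumes activation density means the fraction of token positions where the neuron's activation is above zero. The page does not state this definition, so treat it as an assumption. The function name `activation_density` and the toy data are hypothetical and used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" counts a position as active when its activation
    is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 9 active positions out of 100,000 tokens.
toy_activations = [1.0] * 9 + [0.0] * 99_991
density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints 0.009%
```

Under that assumption, a density of 0.009% corresponds to roughly 9 active tokens per 100,000, which fits a neuron that fires on a single specific token such as "[".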