INDEX
Explanations
failure errors
This neuron fires on words indicating task status changes or errors (e.g. failed, suspended, error).
Negative Logits
Token         Logit
vend          -0.06
_variables    -0.06
_eng          -0.06
(pro          -0.06
rozhod        -0.06
(red          -0.06
verbess       -0.06
-ending       -0.06
�             -0.06
(token        -0.06
Positive Logits
Token         Logit
mennes        0.07
/>)↵          0.07
ADED          0.07
Thinking      0.06
ctrine        0.06
Feeling       0.06
Letters       0.06
prayed        0.06
ocytes        0.06
Published     0.06
Activations Density 0.031%
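
As a rough illustration of how a density figure like the one above might be read, the sketch below assumes that density means the percentage of tokens on which the neuron's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens with a nonzero (positive)
    activation. The page itself does not state how the figure is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 10,000 tokens, 3 of which activate the neuron.
sample = [0.0] * 9997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.031% would correspond to the neuron firing on roughly 3 tokens out of every 10,000.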