Explanations
Code/technical text
This neuron fires whenever the two-word placeholder “no input” appears (i.e. it detects the “< no input >” marker).
Negative Logits
Token       Logit
丶          -0.06
welfare     -0.06
mandatory   -0.06
ps          -0.06
pci         -0.06
trash       -0.06
ircuit      -0.06
control     -0.06
Bay         -0.06
nes         -0.06
Positive Logits
Token       Logit
ofs         0.07
odio        0.06
_Customer   0.06
rien        0.06
باح         0.06
_native     0.06
veren       0.06
-widgets    0.06
trainer     0.06
_matched    0.06
Activations Density: 0.002%
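
One way to read the density figure above is as the percentage of tokens on which the neuron is active. The sketch below assumes that definition (an activation strictly above a threshold of 0 counts as "active"); the function name `activation_density` and the sample data are hypothetical and are not taken from the dashboard or its tooling.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a non-zero
    activation. The dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 2 active tokens out of 100,000.
sample = [0.0] * 99_998 + [1.3, 0.7]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.002% means the neuron is active on roughly 2 out of every 100,000 tokens, which is consistent with a detector for a rare marker such as “< no input >”.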