INDEX
    Explanations

    Code/technical text

    This neuron fires on the two-word placeholder “no input”, i.e. it detects the “< no input >” marker.

    Negative Logits
    (no visible token)    -0.06
    " welfare"            -0.06
    "mandatory"           -0.06
    " ps"                 -0.06
    " pci"                -0.06
    "trash"               -0.06
    "ircuit"              -0.06
    " control"            -0.06
    "Bay"                 -0.06
    "nes"                 -0.06
    Positive Logits
    " ofs"                0.07
    " odio"               0.06
    "_Customer"           0.06
    " rien"               0.06
    "باح"                 0.06
    "_native"             0.06
    " veren"              0.06
    "-widgets"            0.06
    "trainer"             0.06
    "_matched"            0.06
    Act Density 0.002%

    No Known Activations