    Explanations

    conditions for life

    This neuron detects ordinary text tokens in the assistant’s generated answer passages (i.e. non‐header, non‐control words in the assistant’s prose).

    Negative Logits
    "буд"        -0.07
    "Interop"    -0.07
    "IST"        -0.06
    "_app"       -0.06
    "Color"      -0.06
    "sector"     -0.06
    Positive Logits
    "(simp"      0.07
    "ighted"     0.07
    "rix"        0.06
    "virt"       0.06
    "ترنت"       0.06
    " Slack"     0.06
    " Bahrain"   0.06
    "jes"        0.06
    Activation Density: 0.013%

    No Known Activations