INDEX
    Explanations

    reality versus fiction

    This neuron activates on mentions of “human,” specifically when the text refers to the agent as a human being rather than an AI.

    NEGATIVE LOGITS
     children               -0.07
     EP                     -0.06
    .setBackgroundResource  -0.06
     ECB                    -0.06
     @{↵                    -0.06
     WaitForSeconds         -0.06
    Pipe                    -0.06
     moderated              -0.06
    _past                   -0.06
    ors                     -0.06
    POSITIVE LOGITS
    @pytest                 0.07
    *dt                     0.06
     Εθν                    0.06
    _firstname              0.06
     Calif                  0.06
    τευ                     0.06
    foo                     0.06
    字段                    0.06
    окумент                 0.06
     QIcon                  0.06
    Activation Density: 0.023%

    No Known Activations