INDEX
    Explanations

    The neuron activates on words and phrases that express uncertainty or ask how to proceed (e.g. “how,” “approach,” “not sure”), that is, question-framing language about how to tackle a problem.

    Negative Logits
     karar
    -0.06
    ervoir
    -0.06
    anı
    -0.06
     outings
    -0.06
    ันทร
    -0.06
    能源
    -0.06
     keinen
    -0.06
    EFAULT
    -0.06
    *:
    -0.06
    SequentialGroup
    -0.06
    Positive Logits
     disobed
    0.07
    기간
    0.06
    ???↵↵
    0.06
    CERT
    0.06
    0.06
    0.06
    (HttpStatus
    0.06
    dou
    0.06
    cro
    0.06
     Venus
    0.06
    Act Density 0.025%

    No Known Activations