    Explanations

    The neuron detects formatting and presentation instructions—imperative verbs that tell the model how to extract or output information.

    Negative Logits
    Token               Logit
    ".er"               -0.06
    "extr"              -0.06
    "_Default"          -0.06
    "(IC"               -0.06
    " parametro"        -0.06
    "rh"                -0.06
    ";c"                -0.06
    " booklet"          -0.06
    "bler"              -0.06
    "場所"              -0.06
    Positive Logits
    Token               Logit
    " شاهد"             0.07
    "}`;↵↵"             0.07
    (token not shown)   0.07
    " Daemon"           0.06
    (token not shown)   0.06
    "[arg"              0.06
    " [↵"               0.06
    "яет"               0.06
    " удив"             0.06
    "#["                0.06
    Activation Density: 0.083%

    No Known Activations