INDEX
    Explanations

    The neuron detects the occurrence of the word “error” (notably as part of the assistant’s “If you believe this is an error…” feedback request).

    Negative Logits
    影響              -0.07
    _none           -0.06
    abbrev          -0.06
     persists       -0.06
    CENTER          -0.06
     schizophren    -0.06
     mutex          -0.06
    post            -0.06
    ("/");↵         -0.06
     marking        -0.06

    Positive Logits
     stretched       0.07
    _mini            0.07
    (KP              0.07
     그가              0.07
     dolor           0.06
    $core            0.06
    (today           0.06
    /calendar        0.06
     DOWNLOAD        0.06
                     0.06
    Act Density 0.002%

    No Known Activations