    Explanations

    Avoiding typical, ordinary

    This neuron detects requests instructing the model to avoid “generic” or “standard” answers.

    Negative Logits
    Distinct       -0.07
    .springboot    -0.07
     fruit         -0.07
     project       -0.06
                   -0.06
    Beautiful      -0.06
     llama         -0.06
    /power         -0.06
     Niger         -0.06
     favorite      -0.06
    Positive Logits
     (%            0.07
    .awtextra      0.06
     ویژ           0.06
     BAT           0.06
     Đại           0.06
     استان         0.06
     frm           0.06
    usz            0.06
     ohio          0.06
     ";"           0.06
    Activation Density: 0.022%

    No Known Activations