    Explanations

    This neuron activates on words that signal a distinctive or hallmark style, e.g. “signature,” “trademark,” or “characteristic.”

    Negative Logits
     departments
    -0.07
     Αυ
    -0.06
    atical
    -0.06
    いか
    -0.06
     utilities
    -0.06
    .newLine
    -0.06
     Checked
    -0.06
    _related
    -0.06
     confiscated
    -0.06
     aides
    -0.06
    Positive Logits
     BL
    0.06
     topics
    0.06
     선수
    0.06
    .basicConfig
    0.06
     BLOCK
    0.06
    .beginPath
    0.06
     deeply
    0.06
     العم
    0.05
     jot
    0.05
     GLES
    0.05
    Activation Density: 0.023%

    No Known Activations