INDEX
    Explanations

    The neuron fires on research-style emphasis words and phrases—particularly those that highlight novelty, importance, success, or key findings (e.g. “novel,” “major,” “critical,” “successful”).

    Negative Logits
    artists
    -0.07
     Scouts
    -0.06
     wirklich
    -0.06
    tyard
    -0.06
    رير
    -0.06
    unu
    -0.06
     Strauss
    -0.06
     Manning
    -0.06
     MOUSE
    -0.06
     mùi
    -0.06
    Positive Logits
     imgs
    0.07
     its
    0.07
    ')));↵
    0.07
    "]);↵
    0.07
    )));↵
    0.06
    (session
    0.06
    τι
    0.06
    ]);↵
    0.06
    .closest
    0.06
    (dst
    0.06
    Act Density 1.168%

    No Known Activations