    Explanations

    This neuron detects intensifying adverbs and style‐instruction words that signal a request for stronger emphasis or more detailed elaboration.

    Negative Logits
    Token           Logit
    haven           -0.07
    ieran           -0.07
    oley            -0.07
    .f              -0.06
     PyTuple        -0.06
    zeń             -0.06
    erialization    -0.06
    CPF             -0.06
    rewrite         -0.06
     Buccaneers     -0.06

    Positive Logits
    Token           Logit
     getWindow      0.06
    ("..            0.06
     ي              0.06
     slowdown       0.06
     '~             0.06
     pesticides     0.06
     airst          0.06
     utils          0.06
    lerinin         0.06
     recession      0.06
    Activation Density: 0.076%

    No Known Activations