INDEX
    Explanations

    The neuron fires on adverbial “-ly” modifiers, especially emotion and manner adverbs such as “excitedly” in stage directions.

    Negative Logits
    Token              Logit
    " risk"            -0.07
    " Compensation"    -0.06
    "OSP"              -0.06
    "Cost"             -0.06
    " Liqu"            -0.06
    " golf"            -0.06
    " similarity"      -0.06
    " Osman"           -0.06
    " Anton"           -0.06
    " Logic"           -0.06
    Positive Logits
    Token              Logit
    " excited"         0.09
    "ään"              0.07
    " بسی"             0.07
    " herkes"          0.07
    " patriotic"       0.07
    (not shown)        0.07
    " massa"           0.07
    " hiç"             0.07
    "!"                0.07
    " thrilled"        0.06
    Activation Density: 0.033%

    No Known Activations