INDEX
    Explanations

    describing explaining reading

    The neuron activates strongly on the tokens “description” and “descriptions,” typically where the text instructs the reader to describe something or to give a description.

    Negative Logits
     NSK          -0.07
    reatment      -0.06
    ्यव           -0.06
     vybav        -0.06
     commanded    -0.06
     Commander    -0.06
    这样          -0.06
     blowjob      -0.06
     whip         -0.06
    CARD          -0.06

    Positive Logits
    <script       0.07
    \Client       0.07
     "/           0.07
    (min          0.07
    CFG           0.06
     tub          0.06
     Gong         0.06
    /train        0.06
    akin          0.06
    zhou          0.06

    Act Density 0.010%

    No Known Activations