    Explanations

    The neuron activates on tokens that indicate making or applying modifications (e.g., “make,” “change,” “patch,” “modify,” “adjustment,” “rotation”).

    Negative Logits
    Token          Logit
    "Technical"    -0.07
    "PLAYER"       -0.07
    "%%"           -0.06
    " disdain"     -0.06
    ".End"         -0.06
    "UI"           -0.06
    " IsNot"       -0.06
    " API"         -0.06
    "tty"          -0.06
    "/ajax"        -0.06
    Positive Logits
    Token             Logit
    " практически"    0.07
    (blank)           0.07
    "cosa"            0.07
    (blank)           0.06
    " reaction"       0.06
    " thrill"         0.06
    " ruth"           0.06
    "افت"             0.06
    " flew"           0.06
    " bottle"         0.06
    Act Density 0.088%

    No Known Activations