    Explanations

    This neuron activates on the verb “work”, including inflected forms such as “works” and “working”.

    Negative Logits
    Token          Logit
    " recall"      -0.08
    "ícul"         -0.07
    "Intensity"    -0.07
    " Ad"          -0.07
    " onto"        -0.06
    "imating"      -0.06
    " Detect"      -0.06
    "。\"          -0.06
    "ucle"         -0.06
    "-up"          -0.06
    Positive Logits
    Token          Logit
    " working"     0.14
    " worked"      0.12
    " Working"     0.11
    " works"       0.10
    " work"        0.10
    "Working"      0.09
    "(work"        0.08
    " collabor"    0.08
    "_working"     0.08
    " phil"        0.07
    Activation Density: 0.036%

    No Known Activations