INDEX
    Explanations

    The neuron consistently activates on the token “trunk” (including inflected forms such as “trunked”), regardless of context.

    Negative Logits
    Token             Logit
    ' distributors'   -0.07
    'atings'          -0.07
    '_syntax'         -0.07
    ' Screens'        -0.06
    ' arenas'         -0.06
    'Eval'            -0.06
    '},"'             -0.06
    'Beat'            -0.06
    ' cách'           -0.06
    ' Nay'            -0.06
    Positive Logits
    Token             Logit
    ' trunk'          0.14
    '/trunk'          0.08
    ' torso'          0.08
    ' plank'          0.07
    'usk'             0.07
    'ken'             0.07
    ' Root'           0.07
    ' recom'          0.07
    ' unde'           0.07
    ' üzerinden'      0.07
    Activation Density: 0.002%

    No Known Activations