INDEX
    Explanations

    The neuron activates only on the word “funny.”

    Negative Logits
    Token             Logit
    "[test"           -0.07
    ".obs"            -0.06
    " mounted"        -0.06
    "	token"         -0.06
    " Ingredients"    -0.06
    "_impl"           -0.06
    "obao"            -0.06
    " def"            -0.06
    " recounted"      -0.06
    " транс"          -0.06
    Positive Logits
    Token             Logit
    "ify"             0.07
    " Crimes"         0.07
    " mashed"         0.06
                      0.06
    "ій"              0.06
    " coder"          0.06
    "IZE"             0.06
    "_fold"           0.06
    " Hitler"         0.06
    "_ACTIVITY"       0.06
    Act Density 0.010%

    No Known Activations