INDEX
    Explanations

    The neuron activates on the token “other,” effectively spotting occurrences of that word.

    Negative Logits
    adal
    -0.07
    -ups
    -0.07
    -up
    -0.07
    приєм
    -0.07
     the
    -0.06
    NOP
    -0.06
    assertSame
    -0.06
     himself
    -0.06
     schemas
    -0.06
    -0.06
    Positive Logits
    itunes
    0.07
    .food
    0.06
     aliqua
    0.06
    (accounts
    0.06
     quý
    0.06
     корот
    0.06
     ITEM
    0.06
     اصلی
    0.06
     页面
    0.06
     OPT
    0.06
    Act Density 0.091%

    No Known Activations