INDEX
    Explanations

    The neuron fires on mentions of consuming or experiencing media, on tokens such as “watch(ed),” “read,” “anime,” “movies,” “books,” and “serials.”

    Negative Logits
    Token                   Logit
    " labor"                -0.07
    ".ro"                   -0.07
    (token not rendered)    -0.07
    " ra"                   -0.07
    "からない"              -0.07
    " pause"                -0.07
    " induction"            -0.07
    ".getAccount"           -0.06
    "parate"                -0.06
    " mad"                  -0.06
    Positive Logits
    Token                   Logit
    " Decl"                 0.06
    "iefs"                  0.06
    " GHC"                  0.06
    "ansen"                 0.06
    "/res"                  0.06
    "(CONT"                 0.06
    "ández"                 0.06
    ",and"                  0.06
    "věř"                   0.06
    " péri"                 0.06
    Activation Density: 0.132%

    No Known Activations