INDEX
    Explanations

    This neuron activates specifically on the token “break”.

    Negative Logits
    Token         Logit
     "{}          -0.07
     Laden        -0.06
    @property     -0.06
    ोह            -0.06
    ite           -0.06
    (){↵          -0.06
     oldest       -0.06
    новид         -0.06
    adi           -0.06
     Netflix      -0.06
    Positive Logits
    Token         Logit
    JUnit         0.07
     Albuquerque  0.07
    .Margin       0.07
    =g            0.06
    .divide       0.06
    balances      0.06
     Explicit     0.06
    ategor        0.06
    ],[-          0.06
     апреля       0.06
    Activation Density: 0.001%

    No Known Activations