INDEX
    Explanations

    The neuron activates on the Russian first‐person pronoun “я.”

    Negative Logits
    Token         Logit
    "ANTE"        -0.07
    " chasing"    -0.06
    " порів"      -0.06
    " коли"       -0.06
    (blank)       -0.06
    "мот"         -0.06
    "atre"        -0.06
    (blank)       -0.06
    (blank)       -0.06
    "atori"       -0.06
    Positive Logits
    Token           Logit
    " cylinders"    0.07
    "..↵"           0.07
    (blank)         0.07
    " Democratic"   0.07
    ".install"      0.07
    " bal"          0.06
    "archs"         0.06
    "_voltage"      0.06
    "idlo"          0.06
    "_lifetime"     0.06
    Activation Density: 0.013%

    No Known Activations