    Explanations

    The neuron fires on placeholder tokens that stand in for character names (e.g. “NAME_1”, “NAME_2”).
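A minimal sketch of how text containing such placeholders might be located, assuming they always take the form `NAME_` followed by digits. The pattern and the helper name `find_placeholders` are illustrative assumptions, not part of the original page. The check works on raw strings, so it only flags candidate spans and does not reproduce the neuron's token-level activations.

```python
import re
from typing import List

# Assumed placeholder shape: "NAME_" followed by one or more digits.
# The word boundaries keep longer identifiers such as "USERNAME_1" from matching.
PLACEHOLDER = re.compile(r"\bNAME_\d+\b")


def find_placeholders(text: str) -> List[str]:
    """Return every NAME_<n> placeholder in `text`, in order of appearance."""
    return PLACEHOLDER.findall(text)


if __name__ == "__main__":
    sample = "NAME_1 met NAME_2, then NAME_10 left."
    print(find_placeholders(sample))
```

A tokenizer may split a placeholder such as `NAME_1` into several tokens, so the neuron could fire on only some of those pieces.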

    Negative Logits
        Token                Logit
        " mpz"               -0.06
        "plat"               -0.06
        "(Cs"                -0.06
        " reflux"            -0.06
        "paypal"             -0.06
        "untu"               -0.06
        " productService"    -0.06
        " icmp"              -0.06
        " Balt"              -0.06
        " labelText"         -0.06
    Positive Logits
        Token                Logit
        "сих"                0.07
                             0.06
        " surrounds"         0.06
        " hundred"           0.06
        " waiting"           0.06
        "ünde"               0.06
        "-related"           0.06
        "character"          0.06
        "}"."                0.06
        " yeah"              0.06
    Activation Density: 0.019%
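To put the density figure in perspective, here is a small worked calculation. It assumes the density is the percentage of sampled tokens on which the neuron has a non-zero activation; the page does not state this definition. The function name `expected_active_tokens` is hypothetical.

```python
def expected_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens with non-zero activation.

    Assumes `density_percent` is the percentage of tokens on which the
    neuron fires (an assumed reading of the density figure).
    """
    return density_percent / 100.0 * n_tokens


if __name__ == "__main__":
    # At 0.019%, roughly 190 tokens per million would activate the neuron.
    print(round(expected_active_tokens(0.019, 1_000_000)))
```

Under that reading the neuron is very sparse, firing on fewer than two tokens in every ten thousand.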

    No Known Activations