INDEX
    Explanations

    movie credits

    The neuron consistently activates on personal names (e.g., directors, actors, producers) in the text.

    Negative Logits
    Token              Logit
    " замет"           -0.07
    " NONE"            -0.06
    "namese"           -0.06
    "ैश"               -0.06
    " Kit"             -0.06
    " nuclear"         -0.06
    " любой"           -0.06
    "CBC"              -0.06
    "ीं,"              -0.06
    " infiltration"    -0.06
    Positive Logits
    Token              Logit
    " jose"            0.06
    "weigh"            0.06
    " nue"             0.06
    "InnerHTML"        0.06
    " discrim"         0.06
    (token not shown)  0.06
    "žení"             0.06
    "فة"               0.06
    "<↵"               0.06
    "interested"       0.06
    Act Density 0.032%

    No Known Activations