    Explanations

    This neuron fires on prominent content nouns that label key entities or topics (e.g. “method,” “machine,” “City,” “College,” “dance”).

    Negative Logits
    (whitespace)    -0.07
     declared    -0.07
        -0.06
     certifications    -0.06
     praying    -0.06
     ------------------------------------------------------------------------↵    -0.06
    andWhere    -0.06
    Й    -0.06
    ARRY    -0.06
    CLUSIVE    -0.06
    Positive Logits
    .slide    0.07
     chlorine    0.07
    ponsors    0.06
     Leave    0.06
    -Pack    0.06
     vent    0.06
    czas    0.06
    -dimensional    0.06
    حث    0.06
     civilization    0.06
    Activation Density: 0.214%

    No Known Activations