INDEX
    Explanations

    The neuron activates when the word "people" appears. The subsequent tokens suggest descriptions or actions related to people, such as "jog" or "working". The positive logits show a diverse set of potential associations, but the most direct and unifying pattern is the presence of "people" itself.

    Therefore, the most specific and accurate explanation, based on the clear pattern in `MAX_ACTIVATING_TOKENS` and the common context in `TOP_ACTIVATING_TEXTS`, is simply the word "people".

    Explanation: people

    Negative Logits
    Token         Logit
    "ag"          1.64
    " nghiệm"     1.46
    "ig"          1.39
    "ut"          1.38
    "aj"          1.36
    "Alright"     1.31
    " мате"       1.30
    "めに"        1.29
    "IN"          1.27
    "ں"           1.27
    Positive Logits
    Token             Logit
    "ли"              1.51
    "ление"           1.50
    "hips"            1.43
    " publiques"      1.42
    "affiliated"      1.36
    " autochtones"    1.34
    " Algebras"       1.32
    " olefins"        1.32
    " nonexpansive"   1.30
    " parishes"       1.27
    Act Density 0.307%
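
The page does not define "Act Density". A minimal sketch, assuming it means the fraction of sampled token positions on which the neuron's activation is above zero. The function name, the `threshold` parameter, and the toy data below are hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the neuron fires
    at all; the dashboard itself does not state its definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 3 of which are active.
toy_activations = [0.0] * 997 + [0.8, 1.2, 2.5]
density = activation_density(toy_activations)

# Format as a percentage with three decimals, in the style shown on the page.
print(f"Act Density {density:.3%}")
```

Under this reading, a density of 0.307% would mean the neuron fires on roughly 3 of every 1,000 sampled tokens, which would fit a feature tied to a single word.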

    No Known Activations