    Explanations

    The neuron consistently activates on the country name “Brazil,” on adjectival and demonym forms such as “Brazilian” and “Brasil’s,” and on related place names, so it appears to identify mentions of Brazil.

    Negative Logits
    Amb            -0.07
    ีข             -0.07
    Iron           -0.06
     Auch          -0.06
    onacci         -0.06
    porno          -0.06
     DeepCopy      -0.06
    untu           -0.06
     longitude     -0.06
    Telefono       -0.06

    Positive Logits
     Brazil         0.11
    Brazil          0.10
     Brazilian      0.10
     brasile        0.08
     Brasil         0.08
     Janeiro        0.07
    :url            0.07
                    0.07
     brazil         0.07
     Bras           0.07
    Act Density 0.018%

    No Known Activations