Explanations

politicians from various countries

np_max-act · gemini-2.0-flash

This neuron fires on mentions of country names (e.g. “Japan,” “Mexico,” “Norway”), especially in category or parenthetical contexts.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard)

Various

Features

131,072

Data Type

float32

Hook Name

blocks.11.hook_resid_post

Architecture

standard

Context Size

1,024

Dataset

monology/pile-uncopyrighted

Negative Logits

 convinc

-0.08

 assistance

-0.08

 Lazar

-0.06

Validity

-0.06

==========

-0.06

eliminar

-0.06

lui

-0.06

 Philipp

-0.06

Inform

-0.06

 Positive

-0.06

Positive Logits

?“↵↵

0.07

اضي

0.06

 Differences

0.06

>List

0.06

embedded

0.06

�i

0.06



0.06

牙

0.06

_Two

0.06

麦

0.06

Activations Density 0.005%

No Known Activations