INDEX

Explanations

code

np_max-act · gemini-2.0-flash

The neuron fires on colon characters used as key–value separators in structured data (e.g. the “users:” in “users: abc123 …”).

oai_token-act-pair · o4-mini · Triggered by @xinyanhu8
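
To make the explanation above concrete, here is a minimal sketch of a text heuristic that marks colons acting as key-value separators. It is only an illustration of the described pattern, not the feature itself or how the explanation was produced. The regular expression, the name `KEY_VALUE_COLON`, and the function `separator_colon_positions` are all assumptions introduced for this example.

```python
import re

# Hypothetical heuristic (an assumption, not taken from the dashboard):
# a colon counts as a key-value separator when it directly follows an
# identifier-like key and is followed by whitespace or the end of a line.
KEY_VALUE_COLON = re.compile(
    r"(?<![\w.-])[A-Za-z_][\w.-]*(:)(?=\s|$)",
    re.MULTILINE,
)


def separator_colon_positions(text: str) -> list[int]:
    """Return the character offsets of colons that look like key-value separators."""
    return [match.start(1) for match in KEY_VALUE_COLON.finditer(text)]


if __name__ == "__main__":
    # The example string from the explanation: the colon after "users" is flagged.
    print(separator_colon_positions("users: abc123"))
    # Colons in URLs or clock times are not followed by whitespace, so they are skipped.
    print(separator_colon_positions("http://example.com at 12:30"))
```

A real check of the explanation would compare these positions against the feature's recorded activations, which this page does not list.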


Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted



Negative Logits

ंय    -0.07
цы    -0.07
<Option    -0.07
/py    -0.07
 контроль    -0.07
bridge    -0.06
OfFile    -0.06
","","    -0.06
 affects    -0.06
。（    -0.06

Positive Logits

FAILED    0.08
异常    0.07
 تست    0.07
uspend    0.06
fusc    0.06
кот    0.06
ordan    0.06
MAP    0.06
 begging    0.06
emics    0.06

Activations Density 0.407%
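
As a rough reading of the density figure, the sketch below converts it into an expected number of active token positions per context. It assumes that "density" means the fraction of token positions on which the feature has a nonzero activation, and that positions are counted over the 1,024-token context size listed in the configuration. The function name `expected_active_tokens` is hypothetical.

```python
def expected_active_tokens(density_percent: float, num_tokens: int) -> float:
    """Expected count of active token positions, given a density in percent.

    Assumption: density is the share of token positions with a nonzero
    activation, so the expected count is simply density * num_tokens.
    """
    return density_percent / 100.0 * num_tokens


if __name__ == "__main__":
    # 0.407% of a 1,024-token context is roughly 4 active positions.
    print(expected_active_tokens(0.407, 1024))
```

Under that assumption, the feature would be active on about four positions in a typical full-length context, which is consistent with a sparse feature.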


No Known Activations