INDEX

Explanations

=

np_max-act · gemini-2.0-flash

The neuron primarily fires on numeric literal tokens (e.g. integer or floating‐point constants) in code.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

programming-related code snippets and function definitions.

oai_token-act-pair · gpt-4o-mini Triggered by @xinyanhu8

New Auto-Interp

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard)

Various

Features

131,072

Data Type

float32

Hook Name

blocks.11.hook_resid_post

Architecture

standard

Context Size

1,024

Dataset

monology/pile-uncopyrighted

Embeds

IFrame

Link

Not in Any Lists

No Comments

Negative Logits

 Race

-0.06

 subur

-0.06

 Dünya

-0.06

ser

-0.06

PCODE

-0.06

-war

-0.06

�

-0.06

 دنی

-0.06

 histograms

-0.05

İY

-0.05

POSITIVE LOGITS

τικής

0.07

 tohoto

0.07

॰

0.07

.usermodel

0.07

 gather

0.07

 있음

0.06

 всей

0.06

 dní

0.06

 correctness

0.06

blk

0.06

Activations Density 0.049%

=

The neuron primarily fires on numeric literal tokens (e.g. integer or floating‐point constants) in code.

programming-related code snippets and function definitions.

No Comments

No Known Activations

=

The neuron primarily fires on numeric literal tokens (e.g. integer or floating‐point constants) in code.

programming-related code snippets and function definitions.

No Comments

No Known Activations