Explanations

Questions about specifics
np_max-act · gemini-2.0-flash

greetings and inquiries about assistance or help.
oai_token-act-pair · gpt-4o-mini · Triggered by @xinyanhu8

This neuron fires on words used when asking for clarification or specifying details in a request (e.g., “what,” “specific,” “tasks”).
oai_token-act-pair · o4-mini · Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
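The configuration fields above can be collected into one record and cross-checked. The sketch below is illustrative only: the `SaeConfig` class and its method names are hypothetical, the hook-name pattern assumes the `blocks.<layer>.hook_resid_post` naming shown on this page, and the residual width of 4,096 for Llama 3.1 8B is not stated on the page and is supplied here as an outside assumption.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SaeConfig:
    """Hypothetical container for the configuration fields listed on this page."""
    sae_id: str
    n_features: int
    dtype: str
    hook_name: str
    architecture: str
    context_size: int
    dataset: str

    def layer(self) -> int:
        # Extract the layer index from a name shaped like 'blocks.11.hook_resid_post'.
        match = re.fullmatch(r"blocks\.(\d+)\.hook_resid_post", self.hook_name)
        if match is None:
            raise ValueError(f"unexpected hook name: {self.hook_name!r}")
        return int(match.group(1))

    def expansion_factor(self, d_model: int) -> float:
        # Ratio of SAE dictionary size to the width of the hooked activation.
        return self.n_features / d_model


CONFIG = SaeConfig(
    sae_id="andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1",
    n_features=131_072,
    dtype="float32",
    hook_name="blocks.11.hook_resid_post",
    architecture="standard",
    context_size=1_024,
    dataset="monology/pile-uncopyrighted",
)

# Assumption (not from this page): Llama 3.1 8B has a residual width of 4,096.
LLAMA_3_1_8B_D_MODEL = 4_096

print(CONFIG.layer())
print(CONFIG.expansion_factor(LLAMA_3_1_8B_D_MODEL))
```

Under that assumed width, the dictionary is 32 times wider than the activation it reads from, and the layer index parsed from the hook name agrees with the `resid_post_layer_11` segment of the SAE identifier.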

Negative Logits

" becer": -0.07
"」的": -0.07
" Legendary": -0.06
" 전에": -0.06
" leur": -0.06
"난": -0.06
"(numero": -0.06
"_frontend": -0.06
"Wor": -0.06
" metro": -0.06

Positive Logits

" conjunto": 0.07
"elem": 0.06
"NSA": 0.06
".Remote": 0.06
" directions": 0.06
" spokesman": 0.06
"adium": 0.06
" рівня": 0.06
"偶": 0.06
"rod": 0.06
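This page does not say how the logit lists are produced. A common approach, assumed here, is to project the feature's decoder direction through the model's unembedding matrix and rank tokens by the resulting score. The sketch below uses made-up two-dimensional vectors and placeholder token names; none of the numbers come from the real model, and the function names are hypothetical.

```python
def logit_effects(decoder_direction, unembed_columns):
    """Dot the feature's decoder direction with each token's unembedding column."""
    return {
        token: sum(d * u for d, u in zip(decoder_direction, column))
        for token, column in unembed_columns.items()
    }


def top_and_bottom(effects, k):
    """Return (most positive, most negative) tokens, each as a list of (token, score)."""
    ranked = sorted(effects.items(), key=lambda item: item[1])
    negative = ranked[:k]          # lowest scores first
    positive = ranked[::-1][:k]    # highest scores first
    return positive, negative


# Toy data only: a 2-dimensional "residual stream" and four placeholder tokens.
toy_decoder_direction = [1.0, -0.5]
toy_unembed = {
    "tok_a": [0.5, 0.0],
    "tok_b": [-0.5, 0.5],
    "tok_c": [0.0, -0.5],
    "tok_d": [-0.25, 0.0],
}

effects = logit_effects(toy_decoder_direction, toy_unembed)
positive, negative = top_and_bottom(effects, k=2)
print(positive)
print(negative)
```

Every value listed on this page lies within ±0.07, so whichever method produced them, the direct effect of this feature on any single output token appears small.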

Activations Density: 0.036%
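The density figure is not defined on this page. Assuming it means the share of sampled tokens on which the feature activates above zero, it can be computed and read as in the sketch below. The function names and the zero threshold are illustrative assumptions.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for value in activations if value > threshold)
    return active / len(activations)


def tokens_per_activation(density_percent):
    """Average number of tokens between firings for a density given in percent."""
    return 100.0 / density_percent


# Toy sequence: the feature fires on one of four tokens.
print(activation_density([0.0, 0.0, 1.5, 0.0]))

# The density reported on this page, read as "one firing per N tokens".
print(round(tokens_per_activation(0.036)))
```

Under that reading, 0.036% corresponds to roughly one activation in every 2,778 tokens.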
