Explanations

references to social and racial issues, particularly concerning white privilege and reparations

oai_token-act-pair · gpt-4o-mini Triggered by @bot

This neuron appears to be detecting text from diverse contexts (legal documents, political commentary, shopping forums, educational content) without a clear coherent pattern, suggesting it may be misfiring or detecting a spurious correlation rather than identifying a meaningful linguistic feature.

oai_token-act-pair · claude-4-5-haiku Triggered by @emiglarou

Comparing With GEMMA-2-9B @ 31-gemmascope-res-16k

Configuration

google/gemma-scope-9b-pt-res/layer_31/width_16k/average_l0_114

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted
Features: 16,384
Data Type: float32
Hook Name: blocks.31.hook_resid_post
Hook Layer: 31
Architecture: jumprelu
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
Activation Function: relu
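
The configuration names jumprelu as the SAE architecture. As a rough illustration only, and assuming the usual JumpReLU definition (a pre-activation passes through unchanged when it exceeds a per-feature threshold and is zeroed otherwise), the gate can be sketched in Python as follows. The function name and the threshold values are hypothetical and are not taken from this page or from the released SAE weights.

```python
def jumprelu(pre_activations, thresholds):
    """Apply a JumpReLU-style gate elementwise.

    Each pre-activation z is kept as-is when z > threshold and
    replaced by 0.0 otherwise. Inputs are plain lists of floats.
    """
    return [z if z > t else 0.0 for z, t in zip(pre_activations, thresholds)]


# Hypothetical pre-activations and thresholds for three features.
print(jumprelu([0.5, 1.5, -2.0], [1.0, 1.0, 1.0]))  # [0.0, 1.5, 0.0]
```

With every threshold set to 0, this gate reduces to an ordinary ReLU.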

Negative Logits

" rumahnya"  -0.35
" Wassers"  -0.35
" ibunya"  -0.34
" istrinya"  -0.33
" ferner"  -0.33
" Njema"  -0.31
" alguno"  -0.31
" quelcon"  -0.30
" hujan"  -0.30
" nämlich"  -0.30

Positive Logits

"MigrationBuilder"  1.19
" betweenstory"  0.97
"WebElementEntity"  0.92
" című"  0.91
"tagHelperRunner"  0.90
"Autoritní"  0.84
" للمعارف"  0.81
" становника"  0.80
" ब्रेकडाउन"  0.79
" zwiſchen"  0.77

Activations Density: 2.203%
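
An activation density is commonly read as the fraction of token positions on which the feature is active. The sketch below assumes that reading, and further assumes the figure was measured over the dashboard prompts listed in the configuration (24,576 prompts of 128 tokens each); neither assumption is confirmed by this page. Both function names are hypothetical.

```python
def activation_density(activations):
    """Return the fraction of token positions with a positive activation."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


def estimated_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Estimate how many token positions a density percentage corresponds to."""
    total_tokens = num_prompts * tokens_per_prompt
    return round(total_tokens * density_percent / 100)


# Toy activations: 1 of 4 positions is active.
print(activation_density([0.0, 0.0, 3.2, 0.0]))  # 0.25

# Assumed token count, taken from the dashboard prompt settings.
print(estimated_active_tokens(24_576, 128, 2.203))  # 69300
```

Under those assumptions, the feature would fire on roughly 69,300 of about 3.1 million token positions.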

No Known Activations