Neuronpedia

Explanations

programming tests

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-27b-pt-res/layer_34/width_131k

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token (quoted to show leading spaces)    Logit
"套装"          -0.74
"ỗng"           -0.73
"等人"          -0.71
"ровано"        -0.71
"integrity"     -0.69
" destacar"     -0.69
"pubs"          -0.68
" говорить"     -0.68
"Namara"        -0.68
"Accessory"     -0.68

Positive Logits

Token (quoted to show leading spaces)    Logit
" tests"        1.50
" test"         1.45
"Test"          1.34
"测试"          1.20
" Test"         1.16
" Tests"        1.13
"テスト"        1.11
" testing"      1.02
"TEST"          1.00
" TESTS"        1.00

Activations Density 0.004%
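
To put the density figure in context, the arithmetic below is a rough sketch. It assumes the density is the fraction of token positions on which the feature fires, and that it is measured over the same 24,576 dashboard prompts of 128 tokens each. The page states neither, so the resulting count is illustrative only. The variable names are hypothetical.

```python
# Figures copied from the page; how they relate to each other is an assumption.
NUM_PROMPTS = 24_576        # "24,576 prompts"
TOKENS_PER_PROMPT = 128     # "128 tokens each"
DENSITY_PERCENT = 0.004     # "Activations Density 0.004%"

# Total token positions in the dashboard prompt set.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = share of token positions with a nonzero activation.
expected_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"expected active tokens (approx.): {expected_active_tokens:.0f}")
```

Under these assumptions the feature would fire on only about a hundred or so token positions out of roughly three million, which fits a narrowly scoped feature.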

No Known Activations

© Neuronpedia 2025
