Neuronpedia


INDEX

Explanations

<body> or ```

np_acts-logits-general · gemini-2.5-flash-lite


Configuration

google/gemma-scope-2-27b-pt/resid_post/layer_16_width_16k_l0_medium

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted
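
For a sense of scale, the dashboard prompt set listed above can be converted to a total token count. The sketch below uses only the two figures from the configuration and assumes every prompt is exactly 256 tokens long, as stated. The constant and function names are illustrative and not part of any Neuronpedia API.

```python
# Figures copied from the dashboard configuration above.
NUM_PROMPTS = 392_802      # "392,802 prompts"
TOKENS_PER_PROMPT = 256    # "256 tokens each"


def total_dashboard_tokens(num_prompts: int = NUM_PROMPTS,
                           tokens_per_prompt: int = TOKENS_PER_PROMPT) -> int:
    """Return the total number of tokens, assuming fixed-length prompts."""
    return num_prompts * tokens_per_prompt


if __name__ == "__main__":
    # Print the total with thousands separators for readability.
    print(f"{total_dashboard_tokens():,} tokens")
```

Under that assumption the prompt set comes to roughly 100 million tokens, which is the sample against which the activation density further down is reported.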


Negative Logits

0.55

’

0.49

ంట్

0.45

Ꮏ

0.44

هي

0.43

ната

0.42

حان

0.42

↵

0.42

nın

0.42

әне

0.42

Positive Logits

of

0.52

の

0.51

alu

0.50

)\

0.49

೦

0.41

}\

0.41

もの

0.41

 जून

0.41

 तोड़

0.40

0.40

Activations Density 0.000%

No Known Activations

This feature has no known activations.

© Neuronpedia 2025
