Neuronpedia



Explanations

Taylor

np_acts-logits-general · gemini-2.5-flash-lite


Configuration

Source: google/gemma-scope-27b-pt-res/layer_34/width_131k
Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted


Negative Logits

" Schwangerschaft"  -0.76
"ִּ"  -0.75
" במח"  -0.74
" THANKS"  -0.70
"Agregar"  -0.70
" Yesus"  -0.69
"Ʒ"  -0.68
" bạo"  -0.68
"chevron"  -0.68
"চ"  -0.66

Positive Logits

"Taylor"  0.82
"くらい"  0.80
"≯"  0.80
"iens"  0.80
" niem"  0.79
" TAYLOR"  0.76
"Tay"  0.76
" taylor"  0.76
"ロナ"  0.75
"chaften"  0.75

Activations Density 0.016%

No Known Activations

© Neuronpedia 2025
