Neuronpedia



Explanations

fear

np_max-act · gemini-2.0-flash


Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted


Negative Logits

Token             Logit
" fear"           -1.19
" feared"         -1.12
"fear"            -1.11
" fearing"        -1.09
" afraid"         -1.09
" fearful"        -1.09
" Fear"           -1.08
"Fear"            -1.08
"RegressionTest"  -1.08
" fears"          -0.99

Positive Logits

Token             Logit
"ful"             0.75
"of"              0.60
                  0.55
"fully"           0.47
"halb"            0.45
"ver"             0.44
" about"          0.43
"full"            0.43
"hof"             0.43
"ure"             0.42

Activations Density 0.029%
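
To put the density figure in context, the short sketch below converts it into an approximate token count for the dashboard prompt set listed under Configuration. It assumes that "Activations Density" means the fraction of dashboard tokens on which the feature has a nonzero activation; the page does not define the term, so treat the result as a rough estimate. The function name is hypothetical.

```python
# Assumption: "Activations Density" is the share of dashboard tokens
# on which the feature fires. The page itself does not define it.
NUM_PROMPTS = 16_384        # from "Prompts (Dashboard)"
TOKENS_PER_PROMPT = 128     # from "Prompts (Dashboard)"
DENSITY_PERCENT = 0.029     # from "Activations Density"


def estimated_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, estimated number of tokens with nonzero activation)."""
    total_tokens = num_prompts * tokens_per_prompt
    active_tokens = round(total_tokens * density_percent / 100)
    return total_tokens, active_tokens


total, active = estimated_active_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT)
print(f"{total:,} tokens in total, roughly {active:,} with nonzero activation")
```

Under that assumption, the feature would fire on only a few hundred of about two million tokens, which suggests a very sparse feature.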

No Known Activations

© Neuronpedia 2025