Neuronpedia

INDEX

Explanations

explaining terminology

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-2-27b-it/resid_post/layer_40_width_262k_l0_medium

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

"䧺"  0.43
"اؤ"  0.42
" باوجود"  0.40
" mutlaka"  0.40
" memberi"  0.40
"zwischen"  0.40
"র্মে"  0.39
"culture"  0.39
" Dienste"  0.39
" produisent"  0.38

Positive Logits

" correct"  0.47
"相关"  0.46
" 문제"  0.44
"："  0.44
" correcta"  0.42
"慵"  0.41
"一下"  0.40
"如下"  0.40
"事情"  0.40
" adverb"  0.39

Activations Density 0.000%

No Known Activations

© Neuronpedia 2025
