Explanations

force

np_max-act-logits · gemini-2.5-flash-lite

Based on the activation patterns across all the text samples, this neuron activates strongly on first-person narrative perspective and introspective emotional states, particularly when characters are processing complex feelings, memories, or moments of vulnerability. The neuron shows high activations on pronouns like "I

oai_token-act-pair · claude-4-5-haiku Triggered by @vahramatayan

narrative text indicating first-person perspective or character actions, particularly in role-play, dialogue, or story contexts.

oai_token-act-pair · claude-4-5-sonnet Triggered by @vahramatayan

Configuration

google/gemma-scope-2-4b-it/transcoder_all/layer_25_width_262k_l0_small_affine

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

Token          Value
"Conteudos"    0.75
" Toward"      0.74
"antam"        0.68
"Toward"       0.68
" Towards"     0.67
" Townsend"    0.67
"ainted"       0.66
"Horn"         0.65
" וב"          0.65
" towards"     0.65

Positive Logits

Token            Value
" forcing"       0.83
" force"         0.79
" apre"          0.76
" fingerprint"   0.76
" FORCE"         0.74
" resett"        0.71
"forcing"        0.71
" base"          0.70
" শ্বশুর"         0.70
" Force"         0.68

Activations Density 0.195%
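
To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes "activations density" means the fraction of token positions on which the feature fires (activation above a threshold), and that it is measured over the dashboard prompts listed above; neither is confirmed by this page. The function name `activation_density` and the toy data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    `activations` is a list of prompts, each a list of per-token
    activation values (hypothetical layout, for illustration only).
    """
    total = 0
    active = 0
    for prompt in activations:
        for value in prompt:
            total += 1
            if value > threshold:
                active += 1
    return active / total if total else 0.0


# Toy data: 2 prompts of 4 tokens each, with one active token overall.
toy = [[0.0, 0.0, 1.3, 0.0], [0.0, 0.0, 0.0, 0.0]]
density = activation_density(toy)

# Scale implied by the page's figures, ASSUMING the 0.195% density was
# measured over all 238,145 prompts of 512 tokens each.
total_tokens = 238_145 * 512
approx_active = round(total_tokens * 0.195 / 100)
```

Under those assumptions, the feature would fire on roughly a quarter of a million of the dashboard tokens, which is consistent with a sparse feature.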
