Explanations

only always increases To

np_acts-logits-general · gemini-2.5-flash-lite

The marked tokens appear across diverse document types (educational materials, exam questions, comprehension passages) and predominantly highlight parenthetical answer choices or their components, particularly closing parentheses, answer labels, and text within or immediately following answer options. The pattern reflects markup of multiple-choice question answer structure elements, especially focusing on option identifiers and their surrounding punctuation.

eleuther_acts_top20 · claude-4-5-haiku Triggered by @jamesnaruto04

multiple choice answer option text and content within answer choices.

oai_token-act-pair · claude-4-5-haiku Triggered by @jamesnaruto04

Incorrect answer options in multiple-choice questions within educational or assessment contexts, particularly those that contain factually wrong information, misleading statements, or mischaracterizations of concepts.

eleuther_acts_top20 · claude-4-5-sonnet Triggered by @jamesnaruto04

Configuration

google/gemma-scope-2-27b-it/resid_post/layer_31_width_16k_l0_medium
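
The identifier above packs several settings into one slash-separated path. As a rough sketch, it can be split into named parts. The field names (`org`, `release`, `hook`, `layer`, `width`, `l0`) and the function `parse_source_id` are hypothetical labels chosen for illustration and are inferred only from the shape of the string, not from any documented schema.

```python
import re

# The identifier exactly as it appears in the Configuration section.
SOURCE_ID = "google/gemma-scope-2-27b-it/resid_post/layer_31_width_16k_l0_medium"


def parse_source_id(source_id: str) -> dict:
    """Split the identifier into labeled parts (labels are assumptions)."""
    parts = source_id.split("/")
    if len(parts) != 4:
        raise ValueError(f"expected 4 slash-separated parts, got {len(parts)}")
    org, release, hook, sae = parts

    # Assumed pattern for the last segment: layer_<n>_width_<w>_l0_<level>
    match = re.fullmatch(r"layer_(\d+)_width_(\w+?)_l0_(\w+)", sae)
    if match is None:
        raise ValueError(f"unrecognized final segment: {sae!r}")

    return {
        "org": org,
        "release": release,
        "hook": hook,
        "layer": int(match.group(1)),
        # Kept as a string: "16k" is not expanded to an exact feature count here.
        "width": match.group(2),
        "l0": match.group(3),
    }


if __name__ == "__main__":
    for key, value in parse_source_id(SOURCE_ID).items():
        print(f"{key}: {value}")
```

The width is left as the literal string `16k` because the page does not state the exact number of features it stands for.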

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

Token          Value
というか        0.52
 nutty         0.50
 opsi          0.47
ก็              0.47
ו              0.47
 също          0.46
 også          0.46
 också         0.45
 poignant      0.43
টাও            0.43

Positive Logits

Token              Value
 secreto           0.42
提高了              0.42
 mejorar           0.42
 aumentando        0.41
 waardoor          0.41
 wyłącznie         0.38
<unused2162>       0.37
keras              0.37
＿＿                0.37
 automatiquement   0.37

Activations Density 0.091%
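
To give the density figure a sense of scale, the sketch below combines it with the prompt counts listed under Prompts (Dashboard). It rests on two assumptions that the page does not confirm: that every prompt contributes exactly 512 tokens, and that the density is the share of those tokens on which the feature has a nonzero activation. The function names are hypothetical.

```python
# Figures copied from the page.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY_PERCENT = 0.091


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Token count, assuming every prompt has exactly the stated length."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(n_tokens: int, density_percent: float) -> int:
    """Rough number of tokens with nonzero activation (assumed definition)."""
    return round(n_tokens * density_percent / 100)


if __name__ == "__main__":
    n = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    print(f"total tokens: {n:,}")
    print(f"estimated active tokens: {estimated_active_tokens(n, DENSITY_PERCENT):,}")
```

Under those assumptions the feature would fire on roughly one token in every 1,100, which is only an order-of-magnitude estimate because the published density is rounded.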

No Known Activations