Explanations

any illegal

np_acts-logits-general · gemini-2.5-flash-lite

The pattern involves definite articles ("the") and possessive determiners ("your", "his", "its", "their", "either") that indicate specific reference or ownership, often preceding nouns or noun phrases that denote concrete or abstract entities being discussed in context.

eleuther_acts_top20 · claude-4-5-sonnet Triggered by @jamesnaruto04

common grammatical function words and articles like "a," "the," "to," "of," and "be."

oai_token-act-pair · claude-4-5-sonnet Triggered by @jamesnaruto04

Configuration: google/gemma-scope-2-27b-it/resid_post/layer_31_width_16k_l0_medium

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token            Value
" যতক্ষণ"        0.16
"เพื่อให้"         0.16
" దాని"          0.16
"izability"      0.16
" indiquant"     0.15
" রোধ"           0.15
"());"           0.15
"nYou"           0.15
"utérus"         0.14
"导入"            0.14

Positive Logits

Token              Value
" protagonisti"    0.20
" rival"           0.18
" mainstream"      0.18
" protagonists"    0.18
" protagonistas"   0.18
" grandes"         0.17
" contenders"      0.17
" major"           0.17
" innovators"      0.17
" acteurs"         0.17

Activations Density 16.960%
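
To put the density figure in context, here is a rough back-of-the-envelope sketch. It assumes that "Activations Density" means the fraction of dashboard tokens on which the feature has a nonzero activation, and that every prompt contributes the full 512 tokens; neither assumption is confirmed by the page, so treat the result as an upper-bound estimate rather than a reported statistic. The function names are hypothetical.

```python
# Figures copied from the dashboard. How "density" is defined is an assumption:
# here it is taken to be the fraction of tokens with a nonzero activation.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512          # assumes every prompt is exactly 512 tokens
ACTIVATION_DENSITY = 0.16960     # 16.960%


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions in the dashboard prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density: float) -> int:
    """Approximate number of token positions where the feature fires."""
    return round(total * density)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY)
print(f"{total:,} token positions, roughly {active:,} with a nonzero activation")
```

Under these assumptions the feature would be active on roughly one token in six, which fits the explanations above that describe it as firing on very common function words.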
