Explanations

legislation and controversy

np_acts-logits-general · gemini-2.5-flash-lite

This neuron detects language signaling falsehoods or deceptive claims (e.g., words like “false,” “mislead,” “lying,” “harmful,” “drastic cuts”).

oai_token-act-pair · o4-mini Triggered by @jyhe0408

political or advocacy text about policy issues, legislation, and government actions.

oai_token-act-pair · claude-4-5-sonnet Triggered by @jyhe0408

Configuration

google/gemma-scope-27b-pt-res/layer_22/width_131k

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

 स्क्रीन    -1.53
 formales    -1.48
 प्रोड    -1.43
ッチリ    -1.29
 frischen    -1.28
ᴢ    -1.27
 عاشقانه    -1.27
 جميلة    -1.26
 قهوه    -1.25
 evtl    -1.24

Positive Logits

our    1.88
 использовать    1.43
ִּ    1.41
וֹ    1.38
ּוֹ    1.38
amatkan    1.33
 边框    1.30
 المثال    1.30
 كمان    1.30
ciri    1.30

Activation Density: 0.139%

No Known Activations
