Explanations

words or roots related to specific dependencies or components in a software or programming context

oai_token-act-pair · gpt-4o-mini Triggered by @bot

letter u

np_max-act-logits · claude-3-7-sonnet-20250219 Triggered by @johnny

Two- or three-letter abbreviations

np_acts-logits-general · gemini-2.0-flash

city names and endings

np_acts-logits-general · gemini-2.5-flash-lite

rare or specialized terminology, including proper nouns, technical terms, pharmaceutical names, and scientific vocabulary.

oai_token-act-pair · claude-4-5-haiku Triggered by @kparkhamchuk

These examples contain fragments of words with selected syllables or morphemes marked, representing partial morphological units that appear within larger words across diverse technical and non-technical contexts (such as "libuv" from libraries, "NSU" from code, "edu" from place names, medical terms, and legal documents). The pattern reflects mid-word substrings that don't consistently correspond to meaningful linguistic boundaries, suggesting the selection marks text segments that may be important for tokenization, morphological analysis, or language model behavior at the subword level.

eleuther_acts_top20 · claude-4-5-haiku Triggered by @kparkhamchuk

Top Features by Cosine Similarity

Comparing With GEMMA-2-2B @ 20-gemmascope-res-16k

Configuration: google/gemma-scope-2b-pt-res/layer_20/width_16k/average_l0_71

Prompts (Dashboard): 36,864 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Features: 16,384

Data Type: float32

Hook Name: blocks.20.hook_resid_post

Hook Layer: 20

Architecture: jumprelu

Context Size: 1,024

Dataset: monology/pile-uncopyrighted

Activation Function: relu
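The configuration lists jumprelu as the architecture. As a rough illustration only, a JumpReLU unit passes a pre-activation through unchanged when it exceeds a per-feature threshold and outputs zero otherwise. The function name and the threshold values below are hypothetical and are not taken from this SAE's trained parameters.

```python
def jumprelu(pre_acts, thresholds):
    """Elementwise JumpReLU: keep x where x > threshold, otherwise output 0.0."""
    if len(pre_acts) != len(thresholds):
        raise ValueError("pre_acts and thresholds must have the same length")
    return [x if x > t else 0.0 for x, t in zip(pre_acts, thresholds)]


# Toy values (assumed for illustration, not real SAE parameters).
pre_acts = [0.2, 1.5, -0.3, 0.9]
thresholds = [0.5, 0.5, 0.5, 1.0]

print(jumprelu(pre_acts, thresholds))  # [0.0, 1.5, 0.0, 0.0]
```

Unlike a plain ReLU, a positive value such as 0.9 is still zeroed here because it does not clear its threshold of 1.0.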

Negative Logits

ValueStyle      -0.76
SharedCtor      -0.65
complementa     -0.58
lines           -0.56
mos             -0.55
RenderAtEndOf   -0.53
twimg           -0.53
ammlung         -0.53
Threat          -0.52
ting            -0.52

Positive Logits

uuuu                0.93
uuu                 0.88
[token not shown]   0.81
uuuuu               0.76
uu                  0.72
COUVER              0.64
UUUU                0.64
du                  0.63
[token not shown]   0.57
UU                  0.55
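Logit tables like the ones above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and ranking tokens by the resulting score. Whether this dashboard computes its values exactly this way is an assumption. The sketch below uses a hypothetical `logit_effects` helper, a made-up 2-dimensional residual stream, and a 3-token vocabulary; none of the numbers come from the real model.

```python
def logit_effects(decoder_dir, unembed, vocab):
    """Rank tokens by the dot product of a decoder direction with each unembedding column.

    decoder_dir: list of d_model floats (the feature's decoder vector).
    unembed:     d_model rows, each holding one weight per vocabulary token.
    vocab:       token strings, in the same order as the unembedding columns.
    """
    scores = []
    for j, token in enumerate(vocab):
        score = sum(decoder_dir[i] * unembed[i][j] for i in range(len(decoder_dir)))
        scores.append((token, score))
    # Highest score first: the head of the list is the "positive logits",
    # the tail is the "negative logits".
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data (assumed for illustration only).
vocab = ["uuu", "du", "lines"]
decoder_dir = [1.0, -0.5]
unembed = [
    [0.75, 0.5, -0.25],
    [-0.25, -0.5, 0.5],
]

for token, score in logit_effects(decoder_dir, unembed, vocab):
    print(f"{token:8s} {score:+.3f}")
```

With these toy weights the ranking is uuu, du, lines, which mirrors the shape of the tables above: tokens aligned with the feature direction rise to the top and opposed tokens fall to the bottom.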

Activations Density 0.458%
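To put the density figure in concrete terms, the following back-of-the-envelope sketch assumes the 0.458% is measured over the dashboard prompt set (36,864 prompts of 128 tokens each). The page does not state which token set the density refers to, so treat the result as an estimate under that assumption.

```python
# Figures taken from the page.
num_prompts = 36_864
tokens_per_prompt = 128
density_percent = 0.458

# Assumption: density = (tokens where the feature fires) / (all dashboard tokens).
total_tokens = num_prompts * tokens_per_prompt
estimated_active_tokens = round(total_tokens * density_percent / 100)

print(f"total tokens:            {total_tokens:,}")
print(f"estimated active tokens: {estimated_active_tokens:,}")
```

Under that assumption the feature fires on roughly 21,600 of about 4.7 million tokens, which is consistent with a sparse feature.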
