Explanations

references to being "behind" something, particularly in contexts implying secrecy or concealment

oai_token-act-pair · gpt-4o-mini Triggered by @bot

"behind" followed by a preposition/the

np_acts-logits-general · gemini-2.0-flash

behind the scenes

np_acts-logits-general · gemini-2.5-flash-lite

The phrase "behind" (or "behind the") appears consistently across examples to convey the meaning of something occurring out of public view, in private, or not visible to observers. This includes idiomatic uses like "behind closed doors" (private interrogation or decision-making), "behind the scenes" (hidden processes or preparation work), and literal spatial uses like "behind the counter" or "behind the wheel" (positioned at a location). The pattern reflects how "behind" functions as a preposition denoting concealment, privacy, or a position not immediately apparent to an audience.

eleuther_acts_top20 · claude-4-5-haiku Triggered by @kparkhamchuk

Top Features by Cosine Similarity

Comparing With GEMMA-2-2B @ 20-gemmascope-res-16k

Configuration

google/gemma-scope-2b-pt-res/layer_20/width_16k/average_l0_71

Prompts (Dashboard)

36,864 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Features

16,384

Data Type

float32

Hook Name

blocks.20.hook_resid_post

Hook Layer

Architecture

jumprelu

Context Size

1,024

Dataset

monology/pile-uncopyrighted

Activation Function

relu
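The Architecture field above reads `jumprelu`. As a rough illustration only (not the code behind this SAE), a JumpReLU activation passes a pre-activation through unchanged when it exceeds a per-feature threshold and zeroes it otherwise. The function name, pre-activations, and thresholds below are hypothetical toy values.

```python
def jumprelu(pre_acts, thresholds):
    """Pass each pre-activation through only if it exceeds its threshold."""
    return [z if z > t else 0.0 for z, t in zip(pre_acts, thresholds)]

# Hypothetical pre-activations for four features and their thresholds.
pre_acts = [0.2, 1.7, -0.4, 0.9]
thresholds = [0.5, 0.5, 0.5, 1.0]

feature_acts = jumprelu(pre_acts, thresholds)
print(feature_acts)  # [0.0, 1.7, 0.0, 0.0]
```

Note that the last value (0.9) is positive but still zeroed because it falls below its threshold of 1.0, which is where this differs from a plain ReLU.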

Negative Logits

Token               Logit
"MigrationBuilder"  -0.47
"CppMethod"         -0.47
" dinámico"         -0.46
"road"              -0.46
"Tikang"            -0.45
"PYX"               -0.45
"فیلم"              -0.44
" Road"             -0.43
"iastes"            -0.42
" Roads"            -0.42

Positive Logits

Token       Logit
" scenes"   1.36
" Behind"   1.29
"Behind"    1.26
" behind"   1.25
"behind"    1.18
" BEHIND"   1.17
"scenes"    1.15
" Scenes"   1.05
" closed"   1.03
"Scenes"    1.03
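
Logit lists like the two above are commonly produced by taking the dot product of a feature's decoder direction with each token's unembedding vector, then ranking the tokens. The page does not state its exact method, so the following is only a sketch under that assumption; the 3-dimensional vectors and the `logit_effects` helper are hypothetical and do not reproduce the values shown here.

```python
def logit_effects(decoder_dir, unembed):
    """Dot the feature direction with each token's unembedding vector."""
    return {
        token: sum(d * u for d, u in zip(decoder_dir, vec))
        for token, vec in unembed.items()
    }

# Hypothetical 3-dimensional feature direction and unembedding vectors.
decoder_dir = [1.0, 0.5, -0.5]
unembed = {
    " scenes": [1.0, 0.5, 0.0],
    " behind": [0.5, 1.0, 0.0],
    " Road": [-0.5, 0.0, 0.5],
}

effects = logit_effects(decoder_dir, unembed)
ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=True)
top_positive = ranked[0]   # token the feature promotes most
top_negative = ranked[-1]  # token the feature suppresses most
```

With these toy vectors, " scenes" ranks highest and " Road" lowest, mirroring the shape of the lists above.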

Activations Density 0.062%
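
To put the density figure in perspective, it can be combined with the dashboard prompt counts listed under Configuration. This is a back-of-the-envelope estimate that assumes the density was measured over that same set of 36,864 prompts of 128 tokens each, which the page does not confirm.

```python
prompts = 36_864
tokens_per_prompt = 128
density = 0.062 / 100  # 0.062% expressed as a fraction

total_tokens = prompts * tokens_per_prompt
expected_active = total_tokens * density

print(total_tokens)            # 4718592
print(round(expected_active))  # 2926
```

Under that assumption the feature would fire on roughly 2,900 of about 4.7 million tokens.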
