Explanations

time prepositions

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_15/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.15.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Token               Logit
"Conditions"        -0.07
" recalling"        -0.07
"�n"                -0.06
"mens"              -0.06
"pet"               -0.06
" recur"            -0.06
"weit"              -0.06
"Sequential"        -0.06
" intellectuals"    -0.06
"wo"                -0.06

Positive Logits

Token               Logit
"_rewards"          0.07
" jihad"            0.07
"ixer"              0.06
" unanim"           0.06
" Triple"           0.06
" Thornton"         0.06
"oom"               0.06
" admon"            0.06
"_LENGTH"           0.06
" мист"             0.06

Activations Density 0.019%

No Known Activations