Explanations

verbs indicating attempts or actions toward a goal

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Configuration

Juliushanhanhan/llama-3-8b-it-res/blocks.25.hook_resid_post

Features

65,536

Data Type

float32

Hook Name

blocks.25.hook_resid_post

Hook Layer

25

Architecture

gated

Context Size

1,024

Dataset

Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024

Activation Function

relu
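
The configuration above names a gated architecture with a relu activation. As a rough illustration only, the sketch below follows the commonly published gated sparse autoencoder formulation: a binary gate picks which features fire, and a ReLU magnitude path sets how strongly. All names (gated_encode, W_gate, r_mag, and so on) and the toy sizes are assumptions made for this example; the actual weights and code behind this SAE are not shown on the page and may differ.

```python
import numpy as np

def gated_encode(x, W_gate, b_gate, r_mag, b_mag, b_dec):
    """Toy gated-SAE encoder (hypothetical names, not the real implementation)."""
    x_centered = x - b_dec
    pi_gate = x_centered @ W_gate + b_gate        # gate pre-activations
    gate = (pi_gate > 0).astype(x.dtype)          # step function: feature on/off
    # The magnitude path reuses the gate directions, rescaled per feature.
    W_mag = W_gate * np.exp(r_mag)
    mag = np.maximum(x_centered @ W_mag + b_mag, 0.0)  # ReLU magnitudes
    return gate * mag

def decode(features, W_dec, b_dec):
    """Linear decoder back to the residual stream."""
    return features @ W_dec + b_dec

# Toy sizes for illustration; the page lists 65,536 features.
rng = np.random.default_rng(0)
d_in, d_sae = 8, 32
x = rng.normal(size=(4, d_in)).astype(np.float32)
W_gate = rng.normal(size=(d_in, d_sae)).astype(np.float32)
b_gate = np.zeros(d_sae, dtype=np.float32)
r_mag = np.zeros(d_sae, dtype=np.float32)
b_mag = np.zeros(d_sae, dtype=np.float32)
b_dec = np.zeros(d_in, dtype=np.float32)
W_dec = rng.normal(size=(d_sae, d_in)).astype(np.float32)

feats = gated_encode(x, W_gate, b_gate, r_mag, b_mag, b_dec)
recon = decode(feats, W_dec, b_dec)
```

Feature activations produced this way are never negative, and a feature is exactly zero whenever its gate is closed.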

Negative Logits

itize

-0.19

iated

-0.18

italize

-0.17

IZED

-0.16

ISED

-0.16

ILER

-0.16

hausen

-0.16

pone

-0.16

urator

-0.16

ekler

-0.15
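
Logit lists like this one are often obtained by projecting a feature's decoder direction through the model's unembedding matrix and sorting the result. The sketch below shows that idea on random toy data. It is an assumption about the general technique, not a description of how this page computed its numbers; the names logit_effects, W_U and vocab are hypothetical, and any final normalization step a real pipeline might apply is left out.

```python
import numpy as np

def logit_effects(decoder_vec, W_U, vocab, k=10):
    """Return the k most negative and k most positive tokens for one feature."""
    effects = decoder_vec @ W_U          # one score per vocabulary entry
    order = np.argsort(effects)          # ascending order of scores
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return negative, positive

# Random stand-ins for a decoder row and an unembedding matrix.
rng = np.random.default_rng(0)
d_model, vocab_size = 16, 50
vocab = [f"tok{i}" for i in range(vocab_size)]
W_U = rng.normal(size=(d_model, vocab_size))
decoder_vec = rng.normal(size=d_model)

negative, positive = logit_effects(decoder_vec, W_U, vocab, k=3)
```

The negative list starts with the most suppressed token and the positive list with the most promoted one, matching the ordering used on this page.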

Positive Logits

ings

0.71

ing

0.63

ng

0.62

Ing

0.61

ÂŃing

0.58

INGS

0.51

-ing

0.49

ning

0.47

Ing

0.45

ining

0.43
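
One entry in the list above begins with the characters "ÂŃ". That looks like the raw byte-level form that GPT-2 style BPE tokenizers use for display, where each byte is shown as a stand-in character. Under that assumption (an inference, since the page does not say how tokens are rendered), the sketch below rebuilds the standard byte-to-character table and decodes the entry. The helper name decode_token is hypothetical.

```python
def bytes_to_unicode():
    """GPT-2 style table: each byte value maps to a printable stand-in character."""
    bs = list(range(0x21, 0x7F)) + list(range(0xA1, 0xAD)) + list(range(0xAE, 0x100))
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted to code points from 256 upward.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))

# Invert the table: stand-in character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}

def decode_token(display):
    """Turn a byte-level display string back into ordinary text."""
    raw = bytes(BYTE_DECODER[ch] for ch in display)
    return raw.decode("utf-8", errors="replace")

# "\u00c2\u0143ing" is the display string "ÂŃing" written with escapes.
decoded = decode_token("\u00c2\u0143ing")
```

With this table the two leading characters map to the bytes C2 AD, which is the UTF-8 encoding of U+00AD (soft hyphen), so the entry reads as a soft hyphen followed by "ing".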

Activation Density

0.051%
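
The page does not define the density figure. A common reading is the fraction of tokens on which the feature has a nonzero activation, and the sketch below computes that on made-up data. The function name activation_density and the synthetic activations are assumptions for illustration only.

```python
import numpy as np

def activation_density(acts):
    """Fraction of token positions where the feature is active (> 0)."""
    return float(np.count_nonzero(acts > 0)) / acts.size

# Synthetic example: the feature fires on 51 of 100,000 tokens.
acts = np.zeros(100_000, dtype=np.float32)
acts[:51] = 1.0

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.051%
```

At this rate the feature would be active on roughly one token in two thousand.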

No Known Activations