Explanations

phrases indicating intentions or aspirations

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Configuration

Juliushanhanhan/llama-3-8b-it-res/blocks.25.hook_resid_post

Features: 65,536
Data Type: float32
Hook Name: blocks.25.hook_resid_post
Hook Layer:
Architecture: gated
Context Size: 1,024
Dataset: Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024
Activation Function: relu
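
For orientation, here is a minimal sketch of what a "gated" architecture with a ReLU activation generally computes. It follows the standard gated sparse autoencoder formulation and is not confirmed to match this SAE's actual implementation. All parameter names (W_enc, b_gate, r_mag, b_mag, W_dec, b_dec) and the toy dimensions are assumptions chosen for illustration; the real model uses 65,536 features on the residual stream at blocks.25.hook_resid_post.

```python
import numpy as np

def gated_sae_forward(x, W_enc, b_gate, r_mag, b_mag, W_dec, b_dec):
    """Toy gated-SAE forward pass (illustrative names, not from the source)."""
    x_centered = x - b_dec                 # subtract the decoder bias first
    pre = x_centered @ W_enc               # shared encoder projection
    gate = (pre + b_gate) > 0              # gate path: which features fire
    # Magnitude path: same weights rescaled per feature, then ReLU.
    mag = np.maximum(pre * np.exp(r_mag) + b_mag, 0.0)
    feats = gate * mag                     # features are zero where the gate is off
    recon = feats @ W_dec + b_dec          # reconstruction of the input
    return feats, recon

# Toy sizes only; these are NOT the real model dimensions.
rng = np.random.default_rng(0)
d_model, n_features, n_tokens = 8, 16, 4
x = rng.normal(size=(n_tokens, d_model))
W_enc = rng.normal(size=(d_model, n_features))
W_dec = rng.normal(size=(n_features, d_model))
b_gate = np.zeros(n_features)
r_mag = np.zeros(n_features)
b_mag = np.zeros(n_features)
b_dec = np.zeros(d_model)

feats, recon = gated_sae_forward(x, W_enc, b_gate, r_mag, b_mag, W_dec, b_dec)
print(feats.shape, recon.shape)
```

The gate decides whether a feature is active and the magnitude path decides how strongly, which is what distinguishes this design from a plain ReLU encoder.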

Negative Logits

"aterno"    -0.16
"udden"     -0.16
"ñas"       -0.16
" itself"   -0.14
"ado"       -0.14
" promise"  -0.14
" Plan"     -0.14
"ayne"      -0.13
".instant"  -0.13
"emez"      -0.13

Positive Logits

" eventual"    0.28
" eventually"  0.26
" soon"        0.25
"soon"         0.25
" Eventually"  0.21
" Soon"        0.20
"Soon"         0.20
"Eventually"   0.19
"有一"          0.18
"use"          0.18
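
Logit lists like the two above are commonly obtained by projecting a feature's decoder direction through the model's unembedding matrix and sorting the result. The sketch below illustrates that idea on toy data. Whether this page's values were computed exactly this way (for example, with or without a final layer norm) is an assumption, and the function name, vocabulary, and numbers are all hypothetical.

```python
import numpy as np

def top_logit_effects(decoder_dir, W_U, vocab, k=2):
    """Return the k most negative and k most positive logit effects.

    decoder_dir: (d_model,) feature direction; W_U: (d_model, vocab_size).
    Both are toy stand-ins, not values from the real model.
    """
    effects = decoder_dir @ W_U                    # one value per vocabulary token
    order = np.argsort(effects)                    # ascending order
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return negative, positive

# Hypothetical four-token vocabulary and two-dimensional model.
vocab = ["tok_a", "tok_b", "tok_c", "tok_d"]
W_U = np.array([[0.3, -0.2, 0.1, -0.4],
                [0.0,  0.5, 0.0,  0.0]])
decoder_dir = np.array([1.0, 0.0])

negative, positive = top_logit_effects(decoder_dir, W_U, vocab)
print("negative:", negative)
print("positive:", positive)
```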

Activations Density 0.072%
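
As a rough illustration of what a density figure like this means, the sketch below computes the fraction of tokens on which a feature has a nonzero activation. It assumes density is defined as that fraction; the page does not state its exact definition, and the activation values here are synthetic.

```python
import numpy as np

def activation_density(acts):
    """Fraction of token positions where the feature activation is above zero."""
    acts = np.asarray(acts)
    return float((acts > 0).mean())

# Synthetic example: the feature fires on 72 of 100,000 tokens.
acts = np.zeros(100_000)
acts[:72] = 1.5
density = activation_density(acts)
print(f"{density:.3%}")
```

Under this definition, a density of 0.072% corresponds to the feature firing on roughly 72 out of every 100,000 tokens.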

No Known Activations