INDEX

Explanations

instances of negative sentiments or conditions

Configuration

Features

65,536

Data Type

float32

Hook Name

blocks.25.hook_resid_post

Hook Layer

Architecture

gated

Context Size

1,024

Dataset

Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024

Activation Function

relu
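
The configuration values above can be collected into a small record for scripting. This is a minimal sketch, not the page's or any library's own schema: the class name `SAEConfig`, its field names, and the helper `layer_from_hook_name` are hypothetical, and the helper assumes that hook names follow the `blocks.<layer>.<hook>` pattern seen in the Hook Name value.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SAEConfig:
    """Hypothetical container for the values listed under Configuration."""
    features: int
    dtype: str
    hook_name: str
    architecture: str
    context_size: int
    dataset: str
    activation_fn: str


def layer_from_hook_name(hook_name: str) -> Optional[int]:
    """Return the block number embedded in a hook name, or None if absent.

    Assumes the 'blocks.<layer>.<hook>' naming pattern.
    """
    match = re.match(r"blocks\.(\d+)\.", hook_name)
    return int(match.group(1)) if match else None


# Values copied from the Configuration listing.
config = SAEConfig(
    features=65_536,
    dtype="float32",
    hook_name="blocks.25.hook_resid_post",
    architecture="gated",
    context_size=1_024,
    dataset="Juliushanhanhan/openwebtext-1b-llama3-tokenized-cxt-1024",
    activation_fn="relu",
)

print(layer_from_hook_name(config.hook_name))  # prints 25
```

Keeping the record frozen prevents accidental edits to values that identify a specific trained dictionary.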

Negative Logits

-0.16

-0.15

-0.14

 following

-0.14

-grow

-0.14

æĸ

-0.14

Positive Logits

styleType

0.19

 Redistributions

0.18

webkit

0.18

=-=-=-=-=-=-=-=-

0.18

'gc

0.17

wahl

0.16

~-~-~-~-

0.16

 Ø¨ÙĪØ§Ø¨Ø©

0.15

ysz

0.15

Ø´ÙĨØ§Ø³ÛĮ

0.15

Activations Density 0.038%
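
Activation density is read here as the fraction of tokens on which the feature has a nonzero activation. That definition is an assumption, since the page gives only the number. Under that reading, 0.038% corresponds to roughly 4 firing tokens per 10,000. A minimal sketch, with a hypothetical helper `activation_density` and made-up activation values:

```python
from typing import Sequence


def activation_density(activations: Sequence[float]) -> float:
    """Fraction of tokens whose activation is strictly positive."""
    if not activations:
        raise ValueError("need at least one activation")
    active = sum(1 for a in activations if a > 0.0)
    return active / len(activations)


# Illustrative data only: 10,000 tokens, 4 of which activate the feature.
sample = [0.0] * 9_996 + [1.2, 0.3, 2.5, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.040%
```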