Explanations: express


Configuration

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.7.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

" manoe"       -0.08
" Baltimore"   -0.07
" bullet"      -0.07
" burden"      -0.07
" Fault"       -0.07
" bait"        -0.07
"bat"          -0.06

Positive Logits

" expressed"     0.13
" expresses"     0.12
" expressing"    0.12
" express"       0.12
" expressive"    0.11
" Express"       0.11
" Expression"    0.10
" expression"    0.09
" выраж"         0.09
" expressions"   0.09
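The two logit lists can be read as the extremes of one per-token score vector. The sketch below shows how such lists could be produced, assuming each token already has a single score for this feature. The function name `top_and_bottom_tokens` and the toy dictionary are hypothetical and are not part of the dashboard; the values are a subset copied from the lists above.

```python
def top_and_bottom_tokens(logit_effects, k=3):
    """Return (most_positive, most_negative) lists of (token, value) pairs.

    logit_effects: dict mapping a token string to its score for one feature.
    This is a hypothetical helper for illustration only.
    """
    items = list(logit_effects.items())
    # Highest scores first for the positive list.
    most_positive = sorted(items, key=lambda kv: kv[1], reverse=True)[:k]
    # Lowest scores first for the negative list.
    most_negative = sorted(items, key=lambda kv: kv[1])[:k]
    return most_positive, most_negative


# Toy input: a few token/value pairs taken from the dashboard lists.
effects = {
    " expressed": 0.13,
    " expresses": 0.12,
    " expression": 0.09,
    " manoe": -0.08,
    " Baltimore": -0.07,
    "bat": -0.06,
}

pos, neg = top_and_bottom_tokens(effects, k=2)
print(pos)
print(neg)
```

Leading spaces are kept inside the token strings because " express" and "express" are different vocabulary entries.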

Activations Density: 0.030%
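To give a sense of scale for the density figure, here is a minimal sketch that assumes density means the fraction of sampled tokens on which the feature's activation is greater than zero. The function `activation_density` and the toy activation list are hypothetical, not taken from the dashboard.

```python
def activation_density(activations):
    """Fraction of token positions where the feature is active (> 0).

    Assumes 'density' is the share of nonzero activations; this is an
    illustrative definition, not one confirmed by the dashboard.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Toy data: the feature fires on 3 of 10,000 token positions.
acts = [0.0] * 9997 + [1.2, 0.4, 2.5]
density = activation_density(acts)
print(f"{density:.3%}")
```

Under this assumption, a density of 0.030% corresponds to roughly 3 active tokens per 10,000.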