Explanations

No Explanations Found

Configuration

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.23.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Token (quoted)    Logit
" Charlotte"      -0.07
" יל"             -0.07
"氢"              -0.07
" Robot"          -0.07
" strange"        -0.07
"流行"            -0.07
"Wor"             -0.06
" Morm"           -0.06
"fig"             -0.06
"pct"             -0.06

Positive Logits

Token (quoted)    Logit
" access"         0.09
" Accessed"       0.08
"osphere"         0.07
" nakne"          0.07
" Access"         0.07
" accessing"      0.07
"✇"               0.07
".Access"         0.07
" accessed"       0.07

Activations Density: 0.049%
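
To put the density figure in perspective, the short sketch below converts it into per-context terms using the context size listed under Configuration. It assumes that "Activations Density" means the fraction of tokens on which the feature is nonzero, and that tokens activate independently; neither assumption is stated on this page, and the variable names are illustrative only.

```python
# Assumption: "Activations Density" is the fraction of tokens on which
# the feature has a nonzero activation.
density = 0.049 / 100   # 0.049% expressed as a fraction
context_size = 1024     # tokens per context, from the configuration above

# Average number of tokens between activations.
tokens_per_activation = 1 / density

# Expected number of activating tokens in one full context.
expected_per_context = density * context_size

# Probability that a full context contains no activation at all,
# under the simplifying assumption that tokens activate independently.
p_silent_context = (1 - density) ** context_size

print(f"about 1 activation per {tokens_per_activation:,.0f} tokens")
print(f"expected activating tokens per context: {expected_per_context:.2f}")
print(f"chance a context has no activation: {p_silent_context:.1%}")
```

Under these assumptions the feature fires on roughly one token in two thousand, so a typical 1,024-token context contains about half an activating token on average.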

No Known Activations