Explanations

information related to animals, animal rights, and animal welfare

oai_token-act-pair · gpt-3.5-turbo Triggered by @bot

Configuration

jbloom/Gemma-2b-IT-Residual-Stream-SAEs/gemma_2b_it_blocks.12.hook_resid_post_16384

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

HuggingFaceFW/fineweb

Features

16,384

Data Type

float32

Hook Name

blocks.12.hook_resid_post

Hook Layer

12

Architecture

standard

Context Size

1,024

Dataset

Skylion007/openwebtext

Activation Function

relu
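
The configuration above describes a "standard" architecture with a ReLU activation and 16,384 features read from `blocks.12.hook_resid_post`. As a rough illustration of what that usually means, here is a minimal NumPy sketch of a ReLU sparse autoencoder forward pass. Everything in it is an assumption made for illustration: the function names, the random weights, the toy dimensions, and the convention of subtracting the decoder bias before encoding. It is not the released SAE's code or weights.

```python
import numpy as np

def sae_encode(x, W_enc, b_enc, b_dec):
    """ReLU encoder: feature activations are max(0, (x - b_dec) @ W_enc + b_enc).

    Subtracting b_dec first is a common convention, assumed here.
    """
    return np.maximum((x - b_dec) @ W_enc + b_enc, 0.0)

def sae_decode(f, W_dec, b_dec):
    """Linear decoder: reconstruct the residual-stream vector from features."""
    return f @ W_dec + b_dec

# Toy sizes so the sketch runs instantly. For the SAE on this page the
# feature count would be 16,384; the residual width is not stated here.
d_model, d_sae = 8, 32

rng = np.random.default_rng(0)
W_enc = rng.standard_normal((d_model, d_sae), dtype=np.float32)
W_dec = rng.standard_normal((d_sae, d_model), dtype=np.float32)
b_enc = np.zeros(d_sae, dtype=np.float32)
b_dec = np.zeros(d_model, dtype=np.float32)

# Four stand-in residual-stream vectors (random, hypothetical inputs).
x = rng.standard_normal((4, d_model), dtype=np.float32)

features = sae_encode(x, W_enc, b_enc, b_dec)  # shape (4, d_sae), all >= 0
recon = sae_decode(features, W_dec, b_dec)     # shape (4, d_model)
```

Because of the ReLU, every feature activation is non-negative, and a single feature such as the one on this page is simply one column of `features`.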

Negative Logits

Token          Logit
" rendono"     -0.49
" moreno"      -0.49
"cupa"         -0.49
"covite"       -0.44
"ícil"         -0.44
" habet"       -0.43
"Życiorys"     -0.43
" SEDS"        -0.43
"prada"        -0.41
"ureka"        -0.41

Positive Logits

Token          Logit
" animal"      1.19
" animals"     1.11
"animal"       1.11
" Animal"      1.07
"Animal"       1.06
" Animals"     1.03
" ANIMAL"      0.97
"animals"      0.96
"Animals"      0.96
"ANIMAL"       0.89

Activations Density 0.095%
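
To make the density figure concrete, the sketch below treats activation density as the percentage of token positions on which the feature's activation is greater than zero. That definition, the helper name `activation_density`, and the idea that the figure was measured over the dashboard prompts (24,576 prompts of 128 tokens) are all assumptions; the page does not say how the number was computed.

```python
import numpy as np

def activation_density(acts):
    """Percentage of token positions where the feature activation is > 0."""
    acts = np.asarray(acts)
    return 100.0 * np.count_nonzero(acts > 0) / acts.size

# Token budget of the dashboard prompts listed on this page.
n_tokens = 24_576 * 128

# Assumption: if 0.095% was measured over those tokens, this is roughly
# how many token positions the feature fired on.
expected_active = round(0.095 / 100 * n_tokens)

print(n_tokens, expected_active)
```

Under those assumptions the feature would be active on only a few thousand of the roughly three million dashboard tokens, which fits a feature tied to one narrow topic.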

No Known Activations