Explanations

summarization followed by punctuation

The neuron detects salient, content-carrying words: important task or topic nouns and verbs (i.e., semantically informative tokens).

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token        Logit
 tenang      0.24
 existem     0.22
 stratégie   0.22
 déplacer    0.22
 théorie     0.22
 असून         0.21
 utilisés    0.21
 demasi      0.21
 bruge       0.20
 psychiat    0.20

Positive Logits

Token          Logit
."             0.26
.")            0.26
.`             0.25
.”             0.25
。”            0.25
_.             0.24
."""           0.24
↵↵↵↵↵↵↵↵↵↵↵    0.24
".             0.24
。【           0.24

Activations Density: 2.415%
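
The dashboard does not say how the density figure is computed. A common reading, assumed here, is the fraction of token positions on which the neuron's activation is above zero. The sketch below illustrates that reading on made-up data; the function name `activation_density`, the `threshold` parameter, and the toy values are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    `activations` is a list of prompts, each a list of per-token
    activation values. This definition is an assumption, not the
    dashboard's documented formula.
    """
    total = 0
    active = 0
    for prompt in activations:
        for value in prompt:
            total += 1
            if value > threshold:
                active += 1
    if total == 0:
        raise ValueError("no activations supplied")
    return active / total


# Toy data: two prompts of four tokens each, two active positions in total.
toy = [[0.0, 0.0, 1.3, 0.0], [0.0, 0.7, 0.0, 0.0]]
print(f"{activation_density(toy):.3%}")
```

Under this reading, a density of 2.415% would mean the neuron fires on roughly one token in forty across the dashboard prompts.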