INDEX

Explanations

when you're likely to

The neuron fires strongly on direct audience address, especially second-person pronouns like “you” (and related “we”/“us” constructions).

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token         Value
" بش"         0.62
" mishaps"    0.59
" sloppy"     0.59
" badly"      0.59
" thrives"    0.59
" inept"      0.59
" blames"     0.56
" flourishes" 0.56
" sporad"     0.56
" flaws"      0.55

Positive Logits

Token          Value
"essentially"  1.08
" essentially" 1.02
"Essentially"  1.02
" indirectly"  0.96
" Essentially" 0.95
"basically"    0.95
"creating"     0.89
" menciptakan" 0.88
"Creating"     0.88
" Creating"    0.87

Activations Density 0.457%
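
To give the density figure a sense of scale, the sketch below converts it into an approximate token count. It assumes the density is measured over the same prompt set listed under Configuration (392,802 prompts of 256 tokens each); the dashboard does not state this explicitly, so treat the result as a rough estimate. The function name `estimate_active_tokens` is hypothetical and used only for illustration.

```python
# Figures copied from the dashboard above.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
ACTIVATION_DENSITY_PCT = 0.457  # percent of tokens on which the feature is active


def estimate_active_tokens(num_prompts, tokens_per_prompt, density_pct):
    """Return (total tokens, estimated number of tokens with nonzero activation).

    Assumption: the density percentage applies to the full prompt set.
    """
    total = num_prompts * tokens_per_prompt
    return total, total * density_pct / 100.0


total_tokens, active_tokens = estimate_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, ACTIVATION_DENSITY_PCT
)
print(f"total tokens:        {total_tokens:,}")
print(f"est. active tokens:  {active_tokens:,.0f}")
```

Under that assumption the feature is active on roughly one token in every 219, which is sparse but not rare.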