INDEX

Explanations

breach, fraud, shooting, invitation

The neuron strongly activates on language describing legal judgments and criminal charges (e.g., pleas, convictions, offences, sentences).

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token         Logit
"Caching"     -0.75
" KILL"       -0.75
" Canton"     -0.74
"SCN"         -0.74
"enough"      -0.72
" CONV"       -0.72
"Filing"      -0.72
" rezept"     -0.72
" احتمال"     -0.72
"zom"         -0.71

Positive Logits

Token         Logit
"akat"        0.80
" armée"      0.76
"bes"         0.74
" breach"     0.74
" breaches"   0.73
" harian"     0.73
" liệu"       0.71
"Rel"         0.70
" Breach"     0.70
" couleurs"   0.69

Activations Density 0.010%
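
As a rough sanity check on the density figure, the short Python sketch below turns it into an approximate token count. It assumes that the density is the fraction of tokens in the dashboard prompt set (24,576 prompts, 128 tokens each) on which the neuron has a nonzero activation. The page does not state this definition, so the result is an estimate under that assumption, not a reported statistic. The variable names are illustrative.

```python
# Figures copied from the dashboard configuration on this page.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.010 / 100  # 0.010% expressed as a fraction

# Total number of tokens in the dashboard prompt set.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption (not stated on the page): density is
# (tokens with nonzero activation) / (total tokens in the prompt set).
approx_active_tokens = round(total_tokens * ACTIVATION_DENSITY)

print(f"total tokens: {total_tokens:,}")
print(f"approx. activating tokens: {approx_active_tokens:,}")
```

Under that assumption, the neuron fires on only a few hundred of roughly three million tokens, which fits a narrow topic such as legal judgments and criminal charges.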