Explanations

I'm sorry, apologize

The neuron detects apology expressions, particularly occurrences of “sorry” and related apology phrases.

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

"Сейчас"  0.88
"이라"  0.84
" ನೋಡ"  0.82
"ぁ"  0.82
"やっと"  0.80
"이니까"  0.79
"कर्मा"  0.77
" месяца"  0.75
" рассказыва"  0.74
"𝙷"  0.74

Positive Logits

"vad"  0.85
"vaj"  0.79
" Offered"  0.79
"rieben"  0.78
"vl"  0.77
"vat"  0.75
"adati"  0.75
"󠁬"  0.75
" Moreover"  0.74
"iladi"  0.74

Activations Density 0.012%
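
To put the density figure in context, here is a rough back-of-the-envelope sketch. It assumes that "Activations Density" means the fraction of tokens on which the neuron has a nonzero activation, and that the figure was measured over the dashboard prompt set listed under Configuration (392,802 prompts of 256 tokens each). Neither assumption is stated on this page, and the variable names are illustrative only.

```python
# Figures copied from this page; their interpretation is an assumption.
PROMPTS = 392_802          # dashboard prompts
TOKENS_PER_PROMPT = 256    # tokens per prompt
DENSITY_PERCENT = 0.012    # "Activations Density", in percent

# Total number of tokens in the dashboard prompt set.
total_tokens = PROMPTS * TOKENS_PER_PROMPT

# Approximate number of tokens on which the neuron would fire,
# if density is the fraction of tokens with a nonzero activation.
expected_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {expected_active_tokens:,.0f}")
```

Under these assumptions the neuron fires on roughly one token in every 8,300, which would fit a detector for a narrow pattern such as apology phrases.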