Explanations

I think

The neuron detects first‐person opinion or hedging phrases (e.g. “I think,” “I believe,” expressions of personal judgment).

Configuration

Prompts (Dashboard): 392,802 prompts, 256 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

"ли"  1.32
" 어떻게"  0.96
"τητα"  0.95
"如何在"  0.91
"lı"  0.91
"如何"  0.90
"і"  0.90
"com"  0.89
"how"  0.89
"ien"  0.89

Positive Logits

" людьми"  0.98
" grot"  0.91
"rated"  0.90
" bijna"  0.90
" gasp"  0.90
" faceted"  0.89
" unusually"  0.88
"ف"  0.88
" faucibus"  0.88
" devemos"  0.88

Activations Density 0.040%
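
To put the density figure in context, here is a rough back-of-the-envelope sketch in Python. It assumes that "Activations Density" means the fraction of token positions with a nonzero activation, and that it was measured over the full dashboard prompt set listed under Configuration; neither assumption is confirmed by the page. The function name is hypothetical, and because the displayed density is rounded, the result is only an estimate.

```python
# Figures copied from the dashboard configuration and density line.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
ACTIVATION_DENSITY = 0.040 / 100  # 0.040% expressed as a fraction


def estimate_active_tokens(num_prompts: int, tokens_per_prompt: int, density: float):
    """Return (total tokens, estimated active tokens).

    Assumption: density = active token positions / total token positions.
    """
    total_tokens = num_prompts * tokens_per_prompt
    return total_tokens, round(total_tokens * density)


total_tokens, active_tokens = estimate_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, ACTIVATION_DENSITY
)
print(f"total tokens: {total_tokens:,}")
print(f"estimated active tokens: {active_tokens:,}")
```

Under these assumptions the feature would fire on roughly 40,000 of about 100 million token positions, which is consistent with a sparse feature tied to a narrow class of phrases.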