Explanations

confirming details before action

The neuron fires on directive or checklist language in instructional or warning texts: terms such as “review,” “verify,” “ensure,” and “follow” that tell the reader to check or confirm something.

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token            Logit
"Karakter"       -0.94
"iveau"          -0.91
"Klas"           -0.89
" benzin"        -0.87
"ebenarnya"      -0.85
" attempts"      -0.85
"ჩ"              -0.85
" nabo"          -0.81
"ulitan"         -0.80
" Argentino"     -0.79

Positive Logits

Token            Logit
" before"        1.20
"确认"           1.03
" sebelum"       0.99
"决定"           0.90
"IZONA"          0.84
" soprattutto"   0.84
" notamment"     0.84
"final"          0.82
" verkligen"     0.82
" confirm"       0.82

Activations Density 0.025%
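
To put the density figure in context, the sketch below combines it with the prompt configuration listed earlier (24,576 prompts, 128 tokens each). It assumes that "activations density" means the fraction of dashboard tokens on which the neuron has a nonzero activation; the page does not state this definition, so treat the result as a rough estimate. The function name `estimate_active_tokens` is hypothetical and exists only for this illustration.

```python
# Figures copied from the dashboard configuration on this page.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.025  # reported activation density, in percent


def estimate_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, estimated number of tokens that activate the neuron).

    Assumption: density is the share of all dashboard tokens with a
    nonzero activation. This is not confirmed by the page itself.
    """
    total = num_prompts * tokens_per_prompt
    active = total * density_percent / 100.0
    return total, active


total_tokens, active_tokens = estimate_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT
)
print(f"total tokens: {total_tokens:,}")
print(f"estimated active tokens: {active_tokens:.0f}")
```

Under that assumption the neuron would be active on only a few hundred of the roughly three million tokens, which fits a narrowly scoped feature like the one described in the explanation.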