Explanations

disclaimers

The neuron fires on health- and safety-related instructions or warnings (e.g., “do not consume…”, “store out of direct sunlight”, “stop use immediately”).

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

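Negative Logits

Tokens are quoted so that a leading space, where present, stays visible.

" histologic"    0.71
" geliyor"       0.69
" five"          0.69
"కుంది"           0.69
" geniş"         0.68
" galore"        0.68
"वतात"           0.68
" ERISA"         0.68
" exclusivos"    0.67
" meyd"          0.67
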
Positive Logits

Tokens are quoted so that a leading space, where present, stays visible.

"ご注意"           0.92
"Please"         0.85
"कृपया"           0.80
" Please"        0.79
"Kindly"         0.79
"ご了承ください"     0.79
"Você"           0.77
" please"        0.75
"Avoid"          0.74
" कृपया"          0.74

Activations Density 0.030%
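
To get a feel for what a density of 0.030% means in absolute terms, the short sketch below combines it with the prompt count and prompt length listed under Configuration. It rests on two assumptions that the page does not state: that density is the share of tokens with non-zero activation, and that it was measured over this same prompt set. The variable names are illustrative only, and the displayed density is rounded, so the result is an estimate.

```python
# Figures copied from the dashboard.
num_prompts = 392_802
tokens_per_prompt = 256
density_percent = 0.030  # displayed value, already rounded

# Total tokens in the prompt set.
total_tokens = num_prompts * tokens_per_prompt

# Assumption: density = share of tokens on which the neuron activates at all.
approx_active_tokens = round(total_tokens * density_percent / 100)

print(total_tokens)          # 100557312
print(approx_active_tokens)  # 30167
```

Under those assumptions the neuron would be active on roughly 30,000 of about 100 million tokens, which fits a feature that responds to a narrow category of text.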