Explanations

This neuron fires strongly on tokens within enthusiastic or promotional praise, e.g. words in exclamatory, complimentary, or "I would love/try" style positive-recommendation contexts.

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token (quoted to show leading spaces)    Value
"idated"    0.74
" Этот"    0.71
" MyRegisterClass"    0.70
" Kantor"    0.69
" Kunststoff"    0.68
"篾"    0.68
" Questo"    0.68
"ärg"    0.67
" Corrosion"    0.67
" Reprodu"    0.66

Positive Logits

Token (quoted to show leading spaces)    Value
" cruel"    0.81
"ف"    0.80
"ل"    0.80
"ઝ"    0.76
" obligations"    0.74
"گ"    0.74
" einfach"    0.73
"ن"    0.73
" mindless"    0.71
" headwinds"    0.71

Activations Density 0.110%
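
As a rough sanity check on the density figure, the sketch below estimates how many tokens 0.110% corresponds to. It assumes, which the page does not state, that density is the fraction of tokens with nonzero activation across the dashboard prompts listed under Configuration. The variable names are illustrative only.

```python
# Figures taken from the Configuration and Activations Density entries.
num_prompts = 392_802
tokens_per_prompt = 256
density_percent = 0.110

# Total tokens in the dashboard prompt set.
total_tokens = num_prompts * tokens_per_prompt

# ASSUMPTION: density = (tokens with nonzero activation) / (total tokens).
approx_active_tokens = round(total_tokens * density_percent / 100)

print(f"total tokens: {total_tokens:,}")
print(f"approx. activating tokens: {approx_active_tokens:,}")
```

Under that assumption, the neuron activates on roughly 110,000 of about 100 million tokens, which is consistent with a sparse, context-specific feature.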