Explanations

The neuron detects mathematical equations and expressions

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

"磪": 0.62
"闻": 0.61
" passionately": 0.61
" disapproval": 0.60
" সাধারণত": 0.58
"oughby": 0.57
"থায়": 0.57
"isn": 0.56
" blir": 0.56
"聞": 0.56

Positive Logits

"two": 1.09
"two": 1.02
"TWO": 1.02
"three": 1.00
" three": 0.98
" zwei": 0.98
"兩個": 0.98
" dwóch": 0.98
" four": 0.96
" THREE": 0.96

Activations Density 0.404%
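
As a rough illustration of how a density figure like this can be computed, the sketch below assumes that activation density means the fraction of token positions at which the neuron's activation is above zero. That definition, the function name `activation_density`, and the toy data are assumptions for illustration only, not the dashboard's actual implementation.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    `activations` is a list of prompts, each a list of per-token activation
    values. This layout is an assumption made for the sketch.
    """
    total = 0
    active = 0
    for prompt in activations:
        for value in prompt:
            total += 1
            if value > threshold:
                active += 1
    # Guard against an empty dataset.
    return active / total if total else 0.0


# Toy data: two prompts of four tokens each, one active position in total.
toy = [
    [0.0, 0.0, 1.3, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under the same assumption, a dataset of 238,145 prompts with 512 tokens each contains 121,930,240 token positions, so a density of 0.404% would correspond to roughly 490,000 active positions.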