Explanations

on followed by a word

The neuron is essentially a single-token detector: it fires strongly whenever it encounters the preposition “on”, whether capitalized or lowercase.

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

्स	2.53
tól	2.02
(no token shown)	1.97
ನಲ್ಲಿ	1.95
నూ	1.93
有个	1.89
ের	1.80
ሳሪያ	1.77
有個	1.76

Positive Logits

 وعلى	4.00
 behalf	3.70
 horseback	1.92
 základě	1.92
 steroids	1.90
 podstawie	1.71
िक्रमा	1.70
eday	1.70
ри	1.68
 ভিত্তি	1.66

Activations Density 2.744%
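
To put the density figure in context, the short sketch below converts it into an approximate token count using the prompt configuration listed above. It assumes that "Activations Density" means the fraction of all tokens on which the neuron has a nonzero activation, and that it was measured over the full set of 392,802 prompts; neither assumption is confirmed by this page. The function name is illustrative only.

```python
# Figures copied from the Configuration and Activations Density entries above.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
DENSITY_PERCENT = 2.744


def implied_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, estimated number of tokens with nonzero activation)."""
    total = num_prompts * tokens_per_prompt
    # Assumption: density = activating tokens / total tokens, over the full prompt set.
    active = total * density_percent / 100
    return total, active


total_tokens, active_tokens = implied_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT
)
print(f"total tokens: {total_tokens:,}")
print(f"implied activating tokens: {active_tokens:,.0f}")
```

Under these assumptions the neuron would be active on roughly 2.76 million of about 100.6 million tokens.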