Explanations

fractions or divisions

The neuron fires on tokens that signal instructional or advisory language, i.e., the “to”-infinitive and other markers of recommendations and how-to directions.

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

new  0.68
ur  0.66
(blank)  0.65
(blank)  0.63
(blank)  0.62
ウ  0.62
https  0.61
雨  0.61
pppp  0.60
川  0.59

Positive Logits

्लाई  0.62
িকল্প  0.61
 ничек  0.61
DBGPRINT  0.60
 Cumm  0.60
 Moser  0.60
 روی  0.59
䢌  0.59
 এক্ষেত্রে  0.59
 decompression  0.59

Activations Density 0.032%
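
To put the density figure in context, the short sketch below combines it with the prompt configuration listed above (392,802 prompts, 256 tokens each). It assumes that "Activations Density" means the percentage of all tokens in that prompt set on which the feature has a nonzero activation; the dashboard does not state this definition, so treat the result as a rough estimate. The function and constant names are illustrative only.

```python
# Figures copied from the dashboard; the interpretation of "density" as
# "percentage of tokens with nonzero activation" is an assumption.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
DENSITY_PERCENT = 0.032


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(num_prompts: int, tokens_per_prompt: int,
                            density_percent: float) -> float:
    """Estimated number of token positions where the feature is active."""
    return total_tokens(num_prompts, tokens_per_prompt) * density_percent / 100


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT)

print(f"{total:,} token positions in total")
print(f"roughly {active:,.0f} positions with nonzero activation")
```

Under that assumption, the feature is active on roughly 32 thousand of about 100 million token positions, which is typical of a sparse feature.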