Explanations

<?xml and code

This neuron primarily detects positive evaluative words that express praise or admiration.

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token               Logit
" ilgi"             -1.12
"ର"                 -1.08
"其余"              -1.08
"Nachteile"         -1.06
" mengakibatkan"    -1.05
" klachten"         -1.05
"aret"              -1.04
"seamnă"            -1.03
"修为"              -1.02
" persönliche"      -1.02

Positive Logits

Token                Logit
"of"                 1.21
" such"              1.15
" vandens"           1.06
" especially"        1.05
"kých"               0.97
" établissements"    0.94
":],"                0.94
" studies"           0.94
"Blog"               0.93
"mybatisplus"        0.91

Activation Density: 0.011%
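
To give a sense of scale for the density figure, the sketch below works out roughly how many tokens would activate this neuron. It assumes, and the dashboard does not confirm this, that the density is the fraction of activating tokens measured over the dashboard prompts listed under Configuration. The function and constant names are illustrative only.

```python
# Figures copied from the dashboard.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.011


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(total: int, density_percent: float) -> float:
    """Approximate count of tokens on which the neuron fires.

    ASSUMPTION: density is the fraction of tokens with a nonzero
    activation, measured over the same dashboard prompts.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = expected_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"approx. active tokens: {active:.0f}")
```

Under that assumption, the neuron fires on only a few hundred of the roughly 3.1 million dashboard tokens, which would be consistent with a sparse, narrowly tuned feature.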