Explanations

The neuron activates on salient content words, especially named entities, dates and numbers, and topic-specific keywords (important nouns and terms).

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

0.56

ंप

0.52

 Mens

0.49

 دور

0.47

 Misc

0.46

 Madness

0.45

त्मक

0.44

 funktion

0.44

 Measures

0.44

طلع

0.44

Positive Logits

নি

0.54

}}

0.53

یسم

0.48

 sudut

0.48

شي

0.47

tı

0.46

өлү

0.46

 држа

0.45

 warrantless

0.45

سه

0.45

Activations Density: 1.041%
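
As a rough illustration of what a density figure like this measures, the sketch below computes the fraction of token positions on which a neuron's activation is above a threshold. The function name `activation_density`, the zero threshold, and the toy activations are assumptions made for illustration, not the dashboard's actual code. The last lines also assume the density is measured over the full dashboard prompt set (238,145 prompts of 512 tokens each), which the page does not state.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    `activations` is a list of prompts, each a list of per-token activation values.
    (Hypothetical helper; the threshold of 0.0 is an assumption.)
    """
    total = 0
    active = 0
    for prompt in activations:
        for value in prompt:
            total += 1
            if value > threshold:
                active += 1
    return active / total if total else 0.0


# Toy data: 2 prompts of 4 tokens each, with a single active position.
toy = [[0.0, 0.0, 1.3, 0.0], [0.0, 0.0, 0.0, 0.0]]
density = activation_density(toy)
print(f"{density:.3%}")  # 12.500%

# Assumption: the reported density covers every token in the dashboard prompts.
total_positions = 238_145 * 512
approx_active = round(total_positions * 0.01041)
print(total_positions, approx_active)
```

Under that assumption, a density of 1.041% would correspond to roughly 1.27 million active token positions out of about 122 million.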