INDEX

Explanations

female names and roles

This neuron detects personal names—capitalized named entities referring to people.

New Auto-Interp

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Embeds

IFrame

Link

Not in Any Lists

Negative Logits

 handsome

0.35

 সুদর্শন

0.33

david

0.32

 boyhood

0.31

 Himself

0.30

lar

0.30

👔

0.29

 நண்ப

0.29

бав

0.28

POSITIVE LOGITS

 कुमारी

0.55

 Kumari

0.50

 actresses

0.49

 policewomen

0.48

 Louise

0.45

 heroine

0.44

 Elizabeth

0.44

 خاتون

0.44

 actress

0.43

 María

0.43

Activations Density 0.028%