INDEX

Explanations

cardinal numbers

Rationale: The `TOP_POSITIVE_LOGITS` list is dominated by cardinal numbers (thirteen, seventeen, eighteen, nineteen, eighty, fourteen, twelve). Although other tokens in `MAX_ACTIVATING_TOKENS` and `TOP_ACTIVATING_TEXTS` relate to travel, books, and specific names, the strongest signal in the logits points toward numerical concepts. The phrase "cardinal numbers" is concise and accurately reflects this primary signal.

pronoun + verb

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

倘  0.66
當中  0.63
Asimismo  0.59
 notamment  0.57
Comme  0.57
 лишь  0.57
няка  0.56
👌  0.56
แม้  0.55
 désormais  0.54

Positive Logits

 thirteen  0.63
 seventeen  0.61
 eighteen  0.60
 nineteen  0.60
 zuerst  0.58
 dört  0.58
 eighty  0.58
 fourteen  0.57
 இரண்டு  0.55
 twelve  0.52

Activations Density 0.041%
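
As a rough sanity check on what this density means in absolute terms, the arithmetic can be sketched as below. This assumes the density is the fraction of dashboard tokens on which the feature has a nonzero activation, and that it was measured over the full prompt set listed under Configuration; the page confirms neither, so treat the result as an order-of-magnitude estimate. The function and constant names are hypothetical.

```python
# Figures copied from the dashboard text; everything else is an assumption.
NUM_PROMPTS = 392_802              # "392,802 prompts"
TOKENS_PER_PROMPT = 256            # "256 tokens each"
ACTIVATION_DENSITY_PERCENT = 0.041 # "Activations Density 0.041%"


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the dashboard prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimated token positions with nonzero activation.

    Assumes density = active positions / total positions, as a percentage.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"{total:,} token positions in total")
print(f"roughly {active:,.0f} positions with nonzero activation")
```

Under these assumptions the feature fires on a very small share of the data, which fits a sparse feature tied to a narrow token class.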