Explanations

The examples show a mix of formal language, technical terms, and specific entities like names and locations. However, let's re-examine the `MAX_ACTIVATING_TOKENS` and `TOKENS_AFTER_MAX_ACTIVATING_TOKEN` more closely, as they are often the direct clues for simple patterns.

- `should` -> `choose`
- `Aff` -> `irm` (suggests `affirm/confirm` perhaps, or `affirmative`)
- `arn` -> `ed` (suggests `earned/learned`)
- `rev` -> `ised` (suggests `revise/revised`)
- `with` -> `hung` (this one does not fit the suffix pattern)
- `hung` -> `suspended` (this is a strong co-occurrence)
- `Одна` (Russian for "one") -> `жды` (Russian for "times" or "occurrences"); together these mean "once" or "one time", suggesting frequency.
- `station` -> `**` (this isn't a token, likely a formatting artifact)

affirmed, revised, earned, suspended
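The pair-by-pair check above can be sketched in a few lines of Python. This is only an illustrative sketch: the pairs are copied from the list, the `station` -> `**` artifact is left out, and the helper name `follows_with_ed` and the choice of an "-ed" ending as the suffix test are assumptions, not part of the dashboard.

```python
# (max-activating token, token that follows it), copied from the explanation.
pairs = [
    ("should", "choose"),
    ("Aff", "irm"),
    ("arn", "ed"),
    ("rev", "ised"),
    ("with", "hung"),
    ("hung", "suspended"),
    ("Одна", "жды"),
]


def follows_with_ed(token_pairs):
    """Hypothetical check: keep pairs whose following token ends in 'ed'."""
    return [(cur, nxt) for cur, nxt in token_pairs if nxt.endswith("ed")]


matches = follows_with_ed(pairs)
for cur, nxt in matches:
    print(f"{cur} -> {nxt}")
```

Under this assumed test, only `arn` -> `ed`, `rev` -> `ised`, and `hung` -> `suspended` pass, which is in line with the past-tense reading in the summary label; `Aff` -> `irm` would need the token after `irm` to be inspected as well.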

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

0.47

 ridiculous

0.44

通道

0.43

asure

0.42

$)

0.41

 scre

0.41

 warmup

0.40

ฏิ

0.40

 lenient

0.39

ure

0.39

Positive Logits

DAVID

0.50

</h4>

0.49

 diversas

0.49

であった

0.49

 évek

0.48

 Estadística

0.48

 amplio

0.47

 diverso

0.47

 まず

0.47

にお

0.47

Activations Density 0.000%