Explanations

negative feelings and states

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token (quoted to show leading spaces)    Value
" σημ"                                   1.09
"𝑜"                                      1.08
" freck"                                 1.02
"𝑝"                                      1.00
"我覺得"                                  1.00
"فق"                                     0.98
" вс"                                    0.98
" gleaming"                              0.97
"signale"                                0.96
"𝑒"                                      0.96

Positive Logits

Token (quoted to show leading spaces)    Value
" compelled"                             1.71
" obligated"                             1.48
" inadequacy"                            1.39
" obliged"                               1.35
" obligation"                            1.29
" nevoia"                                1.29
" betrayed"                              1.28
" cheated"                               1.28
" sorry"                                 1.26
" entitlement"                           1.24
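
The token lists above rank vocabulary items by how strongly the feature pushes their logits. A common way to produce such rankings is to take the dot product of the feature's decoder direction with each token's unembedding vector and sort the results. Whether this dashboard computes its lists exactly that way is an assumption. The function names, the 2-dimensional vectors, and the toy vocabulary below are hypothetical and are not values from this page.

```python
def logit_effects(decoder_direction, unembedding):
    """Dot the feature direction with each token's unembedding vector.

    `unembedding` maps a token string to a vector with the same length as
    `decoder_direction`. Both inputs are hypothetical toy data here.
    """
    return {
        token: sum(d * u for d, u in zip(decoder_direction, vector))
        for token, vector in unembedding.items()
    }


def top_and_bottom(effects, k):
    """Return the k most boosted and the k most suppressed tokens."""
    ranked = sorted(effects.items(), key=lambda item: item[1], reverse=True)
    most_positive = ranked[:k]
    most_negative = ranked[-k:][::-1]  # most negative first
    return most_positive, most_negative


# Toy example: made-up numbers, chosen only to show the ranking step.
direction = [1.0, -0.5]
toy_unembedding = {
    " compelled": [1.5, 0.0],
    " sorry": [1.0, 0.0],
    " gleaming": [-0.5, 1.0],
    " the": [0.0, 0.0],
}

effects = logit_effects(direction, toy_unembedding)
positive, negative = top_and_bottom(effects, k=2)
print("positive:", positive)
print("negative:", negative)
```

Token strings are kept with their leading spaces because " sorry" and "sorry" are typically distinct vocabulary entries.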

Activations Density 0.221%
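
To put the density figure in context, the sketch below combines it with the prompt count and prompt length listed under Configuration. It assumes that density means the fraction of token positions on which the feature has a non-zero activation, and that it was measured over exactly those dashboard prompts. Neither assumption is stated on the page, and `activation_density` is a hypothetical helper name.

```python
def activation_density(activations):
    """Fraction of token positions with a strictly positive activation.

    Assumed definition; the dashboard does not spell out how it computes density.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    return sum(1 for a in activations if a > 0) / len(activations)


# Figures copied from the dashboard.
N_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY = 0.221 / 100  # 0.221% as a fraction

total_tokens = N_PROMPTS * TOKENS_PER_PROMPT
approx_active_tokens = round(total_tokens * DENSITY)

print(f"total tokens:          {total_tokens:,}")
print(f"approx. active tokens: {approx_active_tokens:,}")
```

Under these assumptions the feature fires on roughly 270 thousand of about 122 million token positions.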