Explanations

automatic actions and detection

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

Token	Logit
"affirming"	0.33
" elucidation"	0.32
"闡"	0.32
" ندار"	0.32
" dobbiamo"	0.32
"ٰ"	0.32
" имели"	0.31
"ೀರಿ"	0.30
"鬣"	0.30
" ছিলেন"	0.30

Positive Logits

Token	Logit
" automatically"	1.28
" автоматически"	1.11
"自动"	1.01
"automatically"	1.00
" detects"	0.98
" 자동으로"	0.97
" automáticamente"	0.96
" Automatically"	0.96
" automatiquement"	0.96
"Automatically"	0.95

Activations Density 0.235%
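
As a rough sanity check on the density figure, the sketch below estimates how many token positions the feature fires on. It assumes, and the dashboard does not state this, that the 0.235% density is measured over the same 238,145 prompts of 512 tokens each listed under Configuration. The variable names are illustrative only.

```python
# Back-of-the-envelope estimate of how many tokens activate this feature.
# ASSUMPTION: the density is computed over the dashboard prompt set
# (238,145 prompts x 512 tokens); the page itself does not confirm this.

NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512

# 0.235% expressed as an exact integer ratio (235 / 100,000)
# so the arithmetic avoids floating-point rounding.
DENSITY_NUMERATOR = 235
DENSITY_DENOMINATOR = 100_000

total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
active_tokens = total_tokens * DENSITY_NUMERATOR // DENSITY_DENOMINATOR

print(f"total tokens:       {total_tokens:,}")
print(f"activating tokens: ~{active_tokens:,}")
```

Under that assumption the feature would be active on roughly 287 thousand of about 122 million token positions, which is consistent with a sparse feature tied to a narrow concept such as "automatically" and its translations.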