INDEX

Explanations

realistic expectations

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token            Value
"perception"     0.94
"والفق"          0.90
"arence"         0.88
" discernible"   0.84
"jective"        0.84
" signified"     0.82
" мнение"        0.81
" längre"        0.80
"赢"             0.80
" detectable"    0.80

Positive Logits

Token            Value
" raised"        0.77
"cas"            0.76
" hija"          0.76
" resetting"     0.73
"ل"              0.71
" Expectations"  0.70
" चूर"           0.69
"สูง"            0.69
" lowered"       0.68
"Kya"            0.68

Activations Density 0.010%
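
To put the density figure in perspective, here is a rough back-of-the-envelope sketch. It assumes (the page does not state this) that the density is the share of tokens with nonzero activation, measured over the same 238,145 prompts of 512 tokens each listed under Configuration. The function name `estimated_active_tokens` is hypothetical and used only for illustration.

```python
def estimated_active_tokens(num_prompts: int, tokens_per_prompt: int,
                            density_percent: float) -> float:
    """Estimate how many tokens a feature fires on.

    Assumption: density_percent is the percentage of all tokens in the
    prompt set on which the feature has a nonzero activation.
    """
    total_tokens = num_prompts * tokens_per_prompt
    return total_tokens * density_percent / 100.0


# Figures taken from the dashboard: 238,145 prompts, 512 tokens each, 0.010% density.
total = 238_145 * 512
estimate = estimated_active_tokens(238_145, 512, 0.010)
print(f"{total:,} tokens in total, roughly {estimate:,.0f} with nonzero activation")
```

Under that assumption, the feature would be active on about one token in every ten thousand, which is consistent with a sparse feature.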