INDEX

Explanations

refusal, preference, offensive

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

first    2.42
fier    2.22
و    2.20
fib    2.17
 ทำ    2.14
fishing    2.14
fälle    2.14
field    2.13
やつ    2.09
뻗    2.09

Positive Logits

तौर    2.31
 Фи    2.30
要是    2.20
Фе    2.20
Фи    2.17
 substantiate    2.16
 Fren    2.14
 Finances    2.06
 Фа    2.03
 Efect    2.03

Activations Density: 1.224%