INDEX

Explanations

control, manipulation, or exploitation

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

 നിങ്ങളുടെ    0.50
(,    0.48
(^    0.46
 בנו    0.45
>+</    0.45
 меньше    0.44
 பள்ள    0.44
 স্থায়ী    0.43
(§    0.43
tx    0.43

Positive Logits

先    0.49
CrossOrigin    0.45
 anger    0.44
 Scripture    0.44
妬    0.43
Liên    0.43
ุงเทพ    0.43
Ме    0.43
梭    0.42
 Memory    0.42

Activations Density 0.001%