Explanations

harmful or dangerous consequences

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token        Logit
"st"         1.39
"non"        1.28
"turtle"     1.21
"law"        1.18
"dri"        1.17
             1.15
"turbine"    1.13
"र"          1.12
"même"       1.11

Positive Logits

Token            Logit
" kinks"         1.30
" помощью"       1.25
"вые"            1.20
" военной"       1.20
"刋"             1.19
" differenza"    1.18
"সঙ্ঘ"           1.18
" deviations"    1.18
"Β"              1.18
" historial"     1.17

Activations Density 0.065%
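
To give the density figure a sense of scale, here is a minimal sketch that converts it into an approximate token count. It assumes that the density is the share of all dashboard tokens on which the feature has a nonzero activation, and that every prompt contributes exactly 512 tokens; neither assumption is confirmed by the page itself. The function name `estimate_active_tokens` is hypothetical and used only for illustration.

```python
def estimate_active_tokens(n_prompts: int, tokens_per_prompt: int,
                           density_percent: float) -> float:
    """Estimate how many tokens activate the feature.

    Assumption: density is the percentage of all tokens with a
    nonzero activation, measured over the full dashboard prompt set.
    """
    total_tokens = n_prompts * tokens_per_prompt
    return total_tokens * density_percent / 100.0


# Figures taken from the dashboard: 238,145 prompts, 512 tokens each,
# activation density 0.065%.
TOTAL_TOKENS = 238_145 * 512
ACTIVE_TOKENS = estimate_active_tokens(238_145, 512, 0.065)

if __name__ == "__main__":
    print(f"total tokens: {TOTAL_TOKENS:,}")
    print(f"estimated activating tokens: {ACTIVE_TOKENS:,.0f}")
```

Under these assumptions the feature fires on a small fraction of the roughly 122 million dashboard tokens, which is consistent with a sparse feature.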