Explanations

toxicity, datasets, which state, capable of, other restaurant, grief at, targeting Linux, database

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

"רו"  0.49
"ﻟ"  0.48
"飴"  0.44
" احنا"  0.44
"恵"  0.43
"után"  0.42
"ուն"  0.41
" عرصے"  0.41
"рого"  0.41
"ल्यानंतर"  0.40

Positive Logits

"ajn"  0.46
"ifs"  0.45
"}:"  0.44
"istors"  0.44
" Waiver"  0.44
" auraient"  0.44
" violence"  0.42
" squats"  0.40
"okens"  0.40

Activations Density 0.000%