INDEX

Explanations

child sexual abuse material

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token             Logit
" behavioural"    0.41
" behavioral"     0.37
" pedibus"        0.37
" साँ"             0.37
" 인식"            0.36
" volgen"         0.36
"崛"              0.36
" affirmation"    0.36
" bör"            0.35
"俪"              0.35

Positive Logits

Token             Logit
" induce"         0.49
" materially"     0.48
" arrest"         0.47
" induces"        0.45
" arrests"        0.44
"arrest"          0.43
" inducing"       0.43
" Arrest"         0.43
" induced"        0.43
"šanas"           0.42

Activations Density: 0.003%
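
To give a sense of scale, the density figure can be turned into an approximate token count. This is only a rough sketch: it assumes the density was measured over the dashboard prompt set listed under Configuration (238,145 prompts of 512 tokens each), which the page does not state, and the 0.003% value is itself rounded. The function and constant names are illustrative, not part of any dashboard API.

```python
# Rough estimate of how many tokens the feature fires on.
# ASSUMPTION: the density is the fraction of dashboard tokens with a
# nonzero activation, measured over the prompt set in Configuration.

NUM_PROMPTS = 238_145        # from "Prompts (Dashboard)"
TOKENS_PER_PROMPT = 512      # from "Prompts (Dashboard)"
DENSITY_PERCENT = 0.003      # from "Activations Density" (rounded on the page)


def estimated_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, estimated number of tokens with activation)."""
    total_tokens = num_prompts * tokens_per_prompt
    active_tokens = round(total_tokens * density_percent / 100)
    return total_tokens, active_tokens


total, active = estimated_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT
)
print(f"{total:,} tokens in total, roughly {active:,} with nonzero activation")
```

Because the published density is rounded to one significant figure, the result should be read as an order of magnitude (a few thousand tokens), not an exact count.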