INDEX

Explanations

generating sexually suggestive

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

Token           Logit
" therefore"    0.65
" that"         0.61
"أ"             0.54
"<code>"        0.53
" però"         0.53
"ம்"            0.52
"ı"             0.52
"о"             0.51
" accordingly"  0.50
"faz"           0.50

Positive Logits

Token            Logit
"<unused1781>"   1.36
" geopolitical"  1.32
" sexual"        1.31
" 기타"          1.28
"<unused162>"    1.28
"<unused399>"    1.28
"<unused1855>"   1.28
"<unused287>"    1.26
"<unused516>"    1.25
"𒐤"              1.25

Activations Density 2.739%