Explanations

avoiding problems or negative outcomes

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token               Logit
"ar"                -1.98
"’’"                -1.98
(blank)             -1.93
"も良い"            -1.84
" дополнительных"   -1.77
"en"                -1.76
"?’"                -1.70
"羕"                -1.69
"気持ちが"          -1.68
"暖かい"            -1.68

Positive Logits

Token               Logit
"媜"                2.38
"𞤢"                2.23
"࿙"                2.08
"皛"                2.08
(blank)             2.08
"୩"                2.02
"tryck"             1.98
"閦"                1.94
"ープン"            1.90
"禵"                1.88

Activations Density: 0.020%
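
As a rough sanity check on the figures above, the sketch below estimates how many token positions the feature fires on. It assumes, without confirmation from this page, that the activation density is the fraction of all dashboard tokens on which the feature is non-zero, and that it was measured over the same 24,576 prompts of 128 tokens each. The function and constant names are illustrative only.

```python
# Figures copied from the dashboard configuration on this page.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.020  # displayed value, already rounded


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(total: int, density_percent: float) -> float:
    """Estimated number of positions with a non-zero activation.

    Assumption: density is the share of all token positions on which
    the feature is active, expressed as a percentage.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = expected_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"total token positions: {total:,}")
print(f"estimated active positions: {active:.0f}")
```

Because the displayed density is rounded to three decimal places, the result is only an order-of-magnitude estimate: a few hundred active positions out of roughly three million.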