Explanations

affirmations and refusals

Configuration

Prompts (Dashboard): 392,802 prompts, 256 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token    Logit
Chú      2.32
ᡳ        2.30
拉       2.28
ᡠ        2.26
ी        2.24
जेट      2.24
ะ        2.20
હિતી     2.20
미       2.16
ళ్       2.15

Positive Logits

Token    Logit
ed       1.97
ة        1.96
ing      1.79
ψε       1.75
ypen     1.72
acek     1.72
uken     1.71
če       1.65
ت        1.64
Й        1.63

Activations Density 0.009%
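
To give a rough sense of scale for the density figure, the sketch below estimates how many tokens the feature fires on. It rests on two assumptions that the page does not state: that "Activations Density" means the fraction of tokens with a nonzero activation, and that it was measured over the same dashboard prompt set listed under Configuration. The function and constant names are illustrative only.

```python
# Figures taken from the dashboard.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
DENSITY_PERCENT = 0.009  # "Activations Density 0.009%"


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens in the dashboard prompt set."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(total: int, density_percent: float) -> float:
    """Estimated count of tokens on which the feature is active.

    Assumes density is the fraction of tokens with nonzero activation.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = expected_active_tokens(total, DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:,.0f}")
```

Under those assumptions the feature would be active on roughly nine thousand of about one hundred million tokens, which is consistent with a sparse feature.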