Explanations

insults and derogatory terms

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

оргаш    -2.63
 céramique    -2.58
 nových    -2.55
(blank)    -2.52
蕻    -2.52
宬    -2.50
婠    -2.48
ᾥ    -2.48
嘡    -2.48
 tanques    -2.47

Positive Logits

↵    4.06
”,    3.67
 Roughly    2.83
’    2.77
 Assuming    2.70
 Importantly    2.67
 Equally    2.64
 Using    2.59
(blank)    2.56
 Ideally    2.53

Activations Density 0.002%
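
To put the density figure in context, the following is a minimal sketch that combines it with the dashboard prompt configuration (24,576 prompts, 128 tokens each). It assumes that "Activations Density" means the fraction of dashboard tokens on which the feature has a non-zero activation, and that the displayed 0.002% is rounded; the page itself does not define the term, so the result is an order-of-magnitude estimate only. The function and constant names are illustrative.

```python
# Figures copied from the dashboard; names are illustrative, not from any API.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.002  # as displayed on the page; probably rounded


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimated count of tokens with non-zero activation.

    Assumption: density is the fraction of tokens on which the feature fires.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:.0f}")
```

Under these assumptions the feature fires on only a few dozen of roughly three million tokens, so the top-activation examples are drawn from a very small pool.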