Explanations

profane or foul language

The neuron flags mentions of the use or investigation of offensive or profane words (e.g. slurs, expletives).

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token          Logit
" parallax"    -0.77
" yearly"      -0.75
"ウェイ"        -0.75
"othermic"     -0.74
"זי"           -0.73
"illères"      -0.73
"氫"            -0.71
" Wink"        -0.71
"Stateful"     -0.71
"teig"         -0.71

Positive Logits

Token          Logit
" swearing"    3.66
" swear"       3.63
" swears"      2.98
" curse"       2.72
" cursing"     2.64
" swore"       2.56
" profane"     2.19
" curses"      2.19
" prof"        2.14
"curse"        2.09

Activations Density 0.039%
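
To give the density figure a sense of scale, the short sketch below converts it into an approximate token count using the prompt configuration listed above. It assumes, which the dashboard does not state explicitly, that the density is the fraction of all dashboard tokens on which the neuron has a nonzero activation. The function and constant names are illustrative only and do not come from any tool.

```python
# Figures copied from the dashboard above.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.039


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> int:
    """Approximate count of tokens on which the neuron fires.

    Assumption: density is the share of all tokens with a nonzero
    activation, expressed as a percentage.
    """
    return round(total * density_percent / 100)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)
print(f"{total:,} tokens in total, roughly {active:,} with nonzero activation")
```

Under that assumption the neuron fires on roughly 1,227 of 3,145,728 tokens. The displayed density is rounded, so the count is an estimate rather than an exact figure.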