Explanations

references to violent or aggressive behavior

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

"rrggbb"           -0.73
"RenderAtEndOf"    -0.69
" sectional"       -0.66
"setScene"         -0.66
"/**"              -0.63
" métiers"         -0.61
"APIs"             -0.60
"ween"             -0.59
"batore"           -0.59
" Sche"            -0.59

Positive Logits

" interferon"      1.34
" violent"         1.12
"violent"          0.91
"Violent"          0.86
" Violent"         0.85
" violently"       0.76
" Monfieur"        0.76
" pleaſure"        0.73
" houſe"           0.71
"Efq"              0.71

Activations Density 0.002%
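
As a rough sanity check on these figures, the sketch below relates the prompt configuration to the reported density. It assumes, which the page does not state, that the density is the fraction of tokens in the dashboard prompt set on which the feature has a nonzero activation. The function names are hypothetical and exist only for this illustration.

```python
# Figures copied from the dashboard; the density is a rounded display value.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.002


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens in the dashboard prompt set."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(total: int, density_percent: float) -> float:
    """Approximate token count with nonzero activation.

    Assumption: density = active tokens / total tokens, as a percentage.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = expected_active_tokens(total, DENSITY_PERCENT)

print(f"{total:,} tokens in the prompt set")
print(f"roughly {active:.0f} tokens with nonzero activation")
```

Under that assumption the feature fires on only a few dozen of the roughly three million tokens, so the top-activating examples come from a very small sample. Because the displayed density is rounded, the count is only an order-of-magnitude estimate.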