Explanations

toxicity/exacerb

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token                 Logit
" exaggeration"       -0.78
" exaggerate"         -0.73
" exagger"            -0.66
" exag"               -0.61
" exager"             -0.60
" exaggerated"        -0.59
"Spoljašnje"          -0.57
"CodedInputStream"    -0.57
" exaggerating"       -0.55
" Waray"              -0.54

Positive Logits

Token                 Logit
" חיצוניים"           0.71
"rungsseite"          0.66
" ujednoznacz"        0.63
"tagHelperRunner"     0.62
":✨"                  0.59
"Gön"                 0.53
"Personensuche"       0.52
" فريبيس"             0.52
" ProtoMessage"       0.51
" AssemblyCulture"    0.50

Activations Density: 0.031%
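
As a rough sense of scale, the dashboard configuration (16,384 prompts of 128 tokens each) comes to about 2.1 million tokens, so a density of 0.031% would correspond to roughly 650 activating tokens. The sketch below works through that arithmetic. It assumes the density is the fraction of dashboard tokens on which the feature has a nonzero activation, which the page does not state, and the variable names are illustrative only.

```python
# Assumption: "Activations Density" is the fraction of dashboard tokens
# with a nonzero activation. The page itself does not define the term.
n_prompts = 16_384          # from "Prompts (Dashboard)"
tokens_per_prompt = 128     # from "Prompts (Dashboard)"
density_percent = 0.031     # from "Activations Density"

# Total number of token positions covered by the dashboard prompts.
total_tokens = n_prompts * tokens_per_prompt

# Approximate count of token positions where the feature is active.
expected_active = total_tokens * density_percent / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. activating tokens: {expected_active:.0f}")
```

Under a different definition of density (for example, a fraction of prompts rather than tokens), the estimate would change accordingly.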