Explanations

harassment, intensity

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token            Logit
" harassing"     -0.62
" harass"        -0.61
" harassment"    -0.52
" harassed"      -0.50
"vania"          -0.47
"izability"      -0.47
"pread"          -0.45
"Rows"           -0.44
"fixing"         -0.43
"fixes"          -0.43

Positive Logits

Token                 Logit
"AndEndTag"           0.75
" CreateTagHelper"    0.74
" JpaRepository"      0.70
"WriteTagHelper"      0.65
"setopt"              0.64
"SourceChecksum"      0.64
"EndContext"          0.62
"apunov"              0.62
" Roskov"             0.61
" autorytatywna"      0.59

Activations Density 0.106%
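
To give the density figure a sense of scale, the short sketch below converts it into an approximate token count. It assumes, and the page does not confirm this, that the density is the fraction of tokens in the dashboard prompt set (16,384 prompts of 128 tokens each) on which the feature has a nonzero activation. The variable names are illustrative only.

```python
# Figures taken from the dashboard configuration above.
num_prompts = 16_384
tokens_per_prompt = 128
density_percent = 0.106  # "Activations Density" as displayed

# Total tokens in the dashboard prompt set.
total_tokens = num_prompts * tokens_per_prompt

# ASSUMPTION: density = (tokens where the feature fires) / (total tokens).
# Under that reading, this is the rough number of activating tokens.
approx_active_tokens = round(total_tokens * density_percent / 100)

print(f"total tokens: {total_tokens:,}")
print(f"approx. activating tokens: {approx_active_tokens:,}")
```

Under that assumption the feature fires on roughly one token in a thousand, which is consistent with a sparse, topic-specific feature.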