Explanations

severe

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token          Logit
" severe"      -1.67
"severe"       -1.52
" Severe"      -1.50
"Severe"       -1.47
" under"       -1.29
" severity"    -1.27
" severely"    -1.23
" sév"         -1.16
" sever"       -1.08
"under"        -1.02

Positive Logits

Token              Logit
"oredCriteria"     0.77
"RegressionTest"   0.58
"вік"              0.58
" expectations"    0.57
"ItemBackground"   0.56
" enough"          0.55
" theirs"          0.54
" antecedents"     0.54
" terms"           0.53
"السكان"           0.53

Activations Density 0.071%
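
To give the density figure a sense of scale, the short sketch below works through the arithmetic implied by the configuration: it multiplies the prompt count by the tokens per prompt, then applies the reported density. It assumes that "Activations Density" means the fraction of dashboard tokens on which the feature has a nonzero activation; the page itself does not define the term, so treat the resulting count as an estimate. The function and constant names are illustrative only.

```python
# Values copied from the dashboard configuration and density line.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.071 / 100  # 0.071% expressed as a fraction


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density: float) -> int:
    """Approximate count of tokens on which the feature fires.

    ASSUMPTION: density is the fraction of tokens with a nonzero
    activation. The dashboard does not state this definition.
    """
    return round(total * density)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY)
print(f"total tokens: {total}")
print(f"estimated active tokens: {active}")
```

Under that assumption, the feature would fire on roughly one token in every 1,400, which is consistent with a sparse feature tied to a narrow set of tokens.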