INDEX

Explanations

restriction

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token            Logit
" Rules"         -0.90
" RULES"         -0.88
" Restriction"   -0.86
" hearing"       -0.85
" restriction"   -0.82
" regole"        -0.80
" Legislation"   -0.79
"Restriction"    -0.79
" rules"         -0.79
" restri"        -0.79

Positive Logits

Token            Logit
"mith"           0.61
"ed"             0.59
"ly"             0.55
" instrument"    0.52
"of"             0.51
" force"         0.51
"ecd"            0.51
"ificantly"      0.50
"for"            0.48
"us"             0.48

Activations Density 0.119%