Explanations

considered

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token               Logit
" considered"       -1.98
" Considered"       -1.78
"considered"        -1.75
" considerados"     -1.42
" considérée"       -1.42
" considéré"        -1.41
" considerado"      -1.41
" consideradas"     -1.38
" regarded"         -1.31
" considerada"      -1.31

Positive Logits

Token               Logit
"to"                0.75
"by"                0.75
"its"               0.64
"the"               0.62
(blank in source)   0.61
" Wilk"             0.60
" abusive"          0.57
"it"                0.57
"as"                0.57
" lawful"           0.56

Activations Density 0.044%