Explanations

took

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token         Logit
" taken"      -1.71
"taken"       -1.50
" Taken"      -1.39
" TAKEN"      -1.37
"Taken"       -1.34
" took"       -1.18
" tomada"     -0.98
"took"        -0.98
" diambil"    -0.98
" Took"       -0.95

Positive Logits

Token            Logit
" advantage"     0.69
"off"            0.66
" place"         0.59
" particular"    0.56
"out"            0.56
"ed"             0.54
" nearly"        0.54
"Източници"      0.52
"ths"            0.52
"obacteria"      0.52

Activations Density: 0.039%
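
To give the density figure a sense of scale, the short sketch below combines it with the prompt configuration listed earlier. It rests on two assumptions that the page itself does not state: that the density is the fraction of tokens on which the feature has a nonzero activation, and that it was measured over the same 16,384 prompts of 128 tokens. The variable names are illustrative only.

```python
# Figures copied from the dashboard configuration and density line.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.039 / 100  # 0.039% expressed as a fraction

# Total number of tokens covered by the dashboard prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = (tokens with nonzero activation) / (all tokens).
expected_active_tokens = total_tokens * ACTIVATION_DENSITY

print(f"total tokens: {total_tokens:,}")
print(f"approx. activating tokens: {expected_active_tokens:.0f}")
```

Under those assumptions the feature would fire on roughly 800 of about 2.1 million tokens, which fits a sparse feature tied to a narrow set of contexts.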