Explanations

dive

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token             Logit
" bezeichneter"   -1.12
" Diversion"      -0.95
" Diving"         -0.91
" eviction"       -0.85
" dividends"      -0.84
" שוליים"         -0.84
" Divided"        -0.82
" Divisional"     -0.81
"glGen"           -0.81
" rewarded"       -0.80

Positive Logits

Token      Logit
"set"      0.56
" using"   0.53
"</b>"     0.51
"in"       0.50
" under"   0.49
" with"    0.48
"or"       0.47
"dero"     0.46
" trang"   0.45

Activations Density 0.203%
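
For a rough sense of scale, the configuration figures and the density can be combined. This is a minimal sketch, assuming the density is the fraction of all dashboard tokens on which the feature activates (the page does not spell out that definition) and that every prompt is exactly 128 tokens long. The variable names are illustrative only.

```python
# Figures taken from the dashboard configuration on this page.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.203 / 100  # 0.203% expressed as a fraction

# Total number of tokens in the dashboard sample.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = (tokens with nonzero activation) / (total tokens).
approx_active_tokens = round(total_tokens * ACTIVATION_DENSITY)

print(f"total tokens: {total_tokens:,}")
print(f"approx. activating tokens: {approx_active_tokens:,}")
```

Under these assumptions the feature would fire on roughly 4,257 of about 2.1 million tokens.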