Explanations

events and revealing

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token          Logit
" exposed"     -1.10
" exposure"    -0.81
" Exposed"     -0.80
" raised"      -0.74
"exp"          -0.73
" expos"       -0.73
" EXPOSURE"    -0.71
" Exposure"    -0.70
"exposed"      -0.69
" expose"      -0.69

Positive Logits

Token               Logit
" policiales"       0.62
" hebdomada"        0.59
" genoux"           0.59
" revanche"         0.59
" Grecs"            0.58
" enfans"           0.58
"WriteTagHelper"    0.57
" démocr"           0.56
" Conſ"             0.56
"openqa"            0.56

Activations Density 0.114%