INDEX

Explanations

references to individuals held in detention

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token             Logit
" jail"           -1.13
" prison"         -0.90
" Jail"           -0.82
"LLocation"       -0.80
" ſte"            -0.69
"jail"            -0.68
"WarningLevel"    -0.66
" ouvido"         -0.65
"CppMethod"       -0.64
"Jail"            -0.64

Positive Logits

Token             Logit
" prisoner"       2.36
" prisoners"      2.27
" Prisoners"      1.92
" Prisoner"       1.88
" prisonniers"    1.07
" captives"       0.97
"prison"          0.95
" prision"        0.95
" Gefang"         0.82
" captive"        0.74

Activations Density 0.007%
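
As a rough sanity check on the figures above, the activation density can be converted into an approximate token count. This is a minimal sketch, and it rests on two assumptions the page does not confirm: that the density is the fraction of tokens with a nonzero activation, and that it was measured over the same 24,576 prompts of 128 tokens listed under Configuration. The variable names are illustrative only.

```python
# Figures quoted on the dashboard.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.007

# Total number of token positions in the dashboard prompt set.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density is the share of token positions where the
# feature fires, measured over this same prompt set.
expected_active_tokens = total_tokens * ACTIVATION_DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. activating tokens: {expected_active_tokens:.0f}")
```

Under those assumptions the feature would fire on roughly 220 of about 3.1 million tokens, which fits a narrowly scoped feature about prisoners and captives.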