Explanations

ruin, sabotage, undermine, derail

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

"よりは"  -1.17
"だけではなく"  -1.09
" nouvelles"  -1.08
" différents"  -1.06
" différentes"  -1.05
" allerlei"  -1.05
" distintos"  -1.03
" lämp"  -1.02
"Cura"  -1.02
" verschillende"  -1.01

Positive Logits

" attempts"  2.25
" efforts"  2.23
" already"  2.09
" ability"  1.99
"せっかく"  1.90
" attempt"  1.82
" valuable"  1.81
" intended"  1.80
" carefully"  1.77
" delicate"  1.72

Activations Density 0.092%
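
To put the density figure in context, the sketch below converts it into an approximate token count. It assumes (this is not stated on the page) that the density is the fraction of all dashboard tokens on which the feature has a nonzero activation, and that every prompt contributes exactly 128 tokens. The function name is hypothetical, and the result is only as precise as the rounded 0.092% figure.

```python
# Figures quoted in the Configuration section and the density line.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.092  # rounded value as displayed


def estimated_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, approximate tokens with nonzero activation).

    Assumption: density = active tokens / total tokens, as a percentage.
    """
    total = num_prompts * tokens_per_prompt
    active = round(total * density_percent / 100)
    return total, active


total_tokens, active_tokens = estimated_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT
)
print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {active_tokens:,}")
```

Under these assumptions the feature fires on roughly 2,900 of about 3.1 million tokens, which is consistent with a sparse feature.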