Explanations

Increasing/decreasing trends

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token              Logit
"хьтан"            -0.91
" increasing"      -0.83
" Increasing"      -0.81
"Increasing"       -0.80
"GEBURTSDATUM"     -0.78
" increased"       -0.77
"LookAnd"          -0.76
"ArgumentParser"   -0.76
" increase"        -0.76
"increased"        -0.76

Positive Logits

Token              Logit
"ly"               0.40
" niyang"          0.38
"什么呢"            0.36
" wort"            0.35
"vue"              0.34
" siyang"          0.33
" interacted"      0.33
"прият"            0.33
"ynka"             0.33
" cade"            0.33

Activations Density: 0.003%
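
As a rough sanity check on the density figure, the sketch below estimates how many tokens in the dashboard sample would activate this feature. It rests on two assumptions that the page does not state: that the density is the fraction of sampled tokens with nonzero activation, and that it is measured over the same 16,384 prompts of 128 tokens each. The function name is hypothetical, and because 0.003% is a rounded value the result is only approximate.

```python
def estimate_active_tokens(
    n_prompts: int, tokens_per_prompt: int, density_percent: float
) -> tuple[int, float]:
    """Return (total sampled tokens, estimated number of activating tokens).

    Assumption: density_percent is the share of sampled tokens on which
    the feature has a nonzero activation.
    """
    total_tokens = n_prompts * tokens_per_prompt
    active_tokens = total_tokens * density_percent / 100.0
    return total_tokens, active_tokens


# Values taken from the Configuration and Activations Density entries.
total, active = estimate_active_tokens(16_384, 128, 0.003)
print(f"{total:,} tokens sampled, roughly {active:.0f} activating")
# prints: 2,097,152 tokens sampled, roughly 63 activating
```

Under these assumptions the feature fires on only a few dozen tokens in the whole sample, which is consistent with a narrowly scoped feature.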