INDEX

Explanations

photograph, frame, root, however, angry

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token (quoted to show leading spaces), value

"excluding"  0.49
"steering"  0.46
" სან"  0.45
"representation"  0.44
"ਹ"  0.44
"아"  0.44
" даты"  0.43
" 블루"  0.42
"ブルー"  0.42
"मलव"  0.42

Positive Logits

Token (quoted to show leading spaces), value

" blog"  0.44
" trest"  0.42
" individual"  0.41
" ер"  0.38
"били"  0.38
" fallen"  0.38
" ground"  0.37
" fond"  0.37
"藜"  0.37
"Dit"  0.36

Activations Density 0.001%