INDEX

Explanations

No Explanations Found

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

컨  0.52
ऱ्या  0.49
下  0.48
伙  0.48
 Floors  0.47
as  0.46
िक्की  0.46
ifères  0.45
၂  0.44
 Testing  0.44

Positive Logits

 attitudes  0.51
 attitude  0.51
attitude  0.50
='"  0.45
 enlightened  0.44
 obsess  0.43
 enlighten  0.42
 actitudes  0.42
 enlightenment  0.42
 indoctr  0.41

Activations Density 0.000%

No Known Activations