INDEX

Explanations

profoundly/deeply + negative adjective

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

is  2.20
ن  1.94
ли  1.71
are  1.45
id  1.41
و  1.39
ally  1.34
ion  1.31
هل  1.30
ie  1.29

Positive Logits

atthena  1.56
통산  1.54
acchati  1.51
 niềm  1.46
 errMsg  1.43
Aloe  1.41
❤️❤️  1.40
邑  1.40
dür  1.38
 profondément  1.38

Activations Density 0.235%
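
As a rough sanity check on the prompt count and density figures above, the sketch below estimates how many tokens the feature fires on. It assumes that activation density means the fraction of all dashboard tokens with a nonzero activation, and that each of the 238,145 prompts contributes exactly 512 tokens. The function names are illustrative and not part of any dashboard API.

```python
def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total tokens in the dashboard prompt set, assuming fixed-length prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(
    num_prompts: int, tokens_per_prompt: int, density_percent: float
) -> int:
    """Approximate number of tokens with a nonzero activation.

    Assumes density_percent is the share of all tokens (in percent)
    on which the feature activates.
    """
    total = total_tokens(num_prompts, tokens_per_prompt)
    return round(total * density_percent / 100)


if __name__ == "__main__":
    # Figures taken from the dashboard: 238,145 prompts, 512 tokens, 0.235%.
    print(total_tokens(238_145, 512))
    print(estimated_active_tokens(238_145, 512, 0.235))
```

Under these assumptions the feature is active on a few hundred thousand tokens out of roughly 122 million, which is consistent with a sparse feature.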