Explanations

obliging to requests

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

 breaking      0.91
 prevent       0.75
 preventive    0.74
 broken        0.74
 prevention    0.74
 active        0.73
 మాట్లాడు      0.72
あ             0.72
对了           0.70
 personal      0.70

Positive Logits

 obliged       1.54
 complied      1.48
 oblige        1.42
 comply        1.34
 complies      1.24
 obeyed        1.21
 complying     1.20
 dutiful       1.16
 hesitated     1.15
 obedient      1.15

Activations Density 0.096%