Explanations

sensitive topics like exploitation

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

Token            Value
" сыгра"         0.40
"enz"            0.37
"Division"       0.36
"InputChange"    0.36
"﹀"             0.36
"ণিজ"            0.35
"supporting"     0.35
"Changes"        0.34
"rások"          0.34
"ref"            0.34

Positive Logits

Token              Value
"忙"               0.57
" delicate"        0.55
" fragile"         0.53
" crowded"         0.52
" busy"            0.52
" sensitive"       0.51
" sedang"          0.50
" ऑलरेडी"          0.50
" hectic"          0.49
" unsuspecting"    0.49

Activations Density 0.157%
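
To put the density figure in concrete terms, the sketch below estimates how many token positions it would correspond to. It rests on two assumptions that this page does not state: that the density is the fraction of token positions with a nonzero activation, and that it was measured over the same 238,145 prompts of exactly 512 tokens each. The function names are illustrative only.

```python
# Figures copied from the dashboard above.
PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY_PERCENT = 0.157


def total_tokens(prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions, assuming every prompt has the full length."""
    return prompts * tokens_per_prompt


def implied_active_tokens(total: int, density_percent: float) -> int:
    """Token positions with a nonzero activation, assuming the density
    is a plain fraction of all token positions (an assumption)."""
    return round(total * density_percent / 100)


total = total_tokens(PROMPTS, TOKENS_PER_PROMPT)
active = implied_active_tokens(total, DENSITY_PERCENT)
print(f"total tokens:  {total:,}")
print(f"active tokens: {active:,} (approximate)")
```

Under those assumptions the feature would fire on roughly 191 thousand of about 122 million token positions. If the density was computed on a different sample, only the percentage itself carries over.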