Explanations

describing physical attributes

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

"ﻒ"  0.54
"Categ"  0.52
"Damage"  0.51
"क्ष्य"  0.47
"なく"  0.47
"砾"  0.47
" layoff"  0.46
"ूरिया"  0.46
"ना"  0.46
"즙"  0.46

Positive Logits

" conquered"  0.41
" comercio"  0.40
" conquer"  0.39
"han"  0.39
" authoritarian"  0.39
" supremo"  0.39
" dirigeants"  0.38
" encargado"  0.38
" totalitarian"  0.37
" متنوع"  0.37

Activations Density 0.002%