Explanations

direct praise and compliments

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

=-\             0.54
=-              0.53
}=-\            0.50
}}=-            0.50
}=-             0.50
)=-             0.47
=-\             0.47
 frightening    0.46
=-              0.46
]=-             0.46

Positive Logits

 praise         2.06
 praising       2.05
 admiration     1.98
 praises        1.91
 प्रशंसा          1.88
 admiring       1.83
 প্রশংসা         1.80
 तारीफ          1.76
 praised        1.73
 admires        1.73

Activations Density 0.033%
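
To give the density figure a sense of scale, the sketch below converts it into an approximate token count. It assumes, which the page does not state, that the 0.033% density is measured over the same dashboard prompt set listed under Configuration (238,145 prompts of 512 tokens each). The variable names are illustrative only.

```python
# Rough scale estimate for the activation density.
# ASSUMPTION: the density is computed over the dashboard prompt set;
# the source page does not confirm which token set it refers to.
NUM_PROMPTS = 238_145        # from "Prompts (Dashboard)"
TOKENS_PER_PROMPT = 512      # from "Prompts (Dashboard)"
DENSITY_PERCENT = 0.033      # from "Activations Density"

total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
estimated_active = round(total_tokens * DENSITY_PERCENT / 100)

print(f"total tokens:           {total_tokens:,}")
print(f"estimated active tokens: {estimated_active:,}")
```

Under that assumption, the feature would fire on roughly 40 thousand of about 122 million tokens, which is consistent with a sparse, narrowly scoped feature.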