Explanations

causality reasons explanations

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each
Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token (quoted to show leading spaces)    Value
"쭉"           0.45
" সুতরাং"      0.43
" ***!"        0.40
" اپنا"        0.40
" mangiare"    0.39
" অতএব"        0.38
" 테스트"      0.38
" ফাইল"        0.38
"ARCHIVO"      0.38
"!!!!!!!"      0.38

Positive Logits

Token (quoted to show leading spaces)    Value
" because"     0.52
"because"      0.51
" Karena"      0.51
" Ведь"        0.50
" যেহেতু"      0.48
" نیز"         0.47
"Because"      0.47
" ponieważ"    0.46
"porque"       0.46
" Porque"      0.46

Activations Density 0.188%
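
To give the density figure a sense of scale, the sketch below converts it into an approximate token count. It assumes that the density is the fraction of tokens on which the feature has a non-zero activation, and that it was measured over the dashboard prompts listed under Configuration; the page states neither, so treat the result as a rough estimate. The function name `estimated_active_tokens` is hypothetical and used only for illustration.

```python
# Figures copied from the dashboard; everything else is an assumption.
NUM_PROMPTS = 238_145        # "238,145 prompts"
TOKENS_PER_PROMPT = 512      # "512 tokens each"
DENSITY_PERCENT = 0.188      # "Activations Density 0.188%"


def estimated_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, estimated tokens with non-zero activation).

    Assumes density is a per-token firing rate over the whole prompt set.
    """
    total = num_prompts * tokens_per_prompt
    active = round(total * density_percent / 100)
    return total, active


total_tokens, active_tokens = estimated_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT
)
print(f"total tokens: {total_tokens:,}")
print(f"estimated active tokens: {active_tokens:,}")
```

Under these assumptions the feature would fire on roughly one token in every 530, which fits a feature tied to a single connective such as "because" and its translations.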