Explanations

risks pitfalls dangers

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token            Value
"ınca"           0.46
"에서의"         0.46
"Opportunity"    0.46
"ayena"          0.44
"not"            0.44
"not"            0.43
"ได้"            0.42
"ǚ"              0.42
"记"             0.41
"∉"              0.41

Positive Logits

Token            Value
" tracts"        0.43
" ills"          0.42
" pitfalls"      0.41
" overhaul"      0.41
" risks"         0.39
" oran"          0.39
" risques"       0.39
" तब्"           0.39
" peligros"      0.38
" ścian"         0.37

Activations Density 0.018%
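
To give a sense of scale for the density figure, here is a rough back-of-the-envelope sketch. It assumes that "Activations Density" means the percentage of token positions on which the feature fires, and that it was measured over the same 238,145 prompts of 512 tokens listed under Configuration. Neither assumption is stated on this page, and the function names are illustrative only.

```python
# Figures copied from the dashboard. How they combine is an ASSUMPTION:
# density is treated as the percentage of token positions with a nonzero
# activation, measured over the dashboard prompt set.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY_PERCENT = 0.018


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions in the prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> int:
    """Approximate number of positions where the feature is active."""
    return round(total * density_percent / 100)


if __name__ == "__main__":
    total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    active = estimated_active_tokens(total, DENSITY_PERCENT)
    print(f"token positions: {total:,}")
    print(f"estimated active positions: {active:,}")
```

Under these assumptions the feature would fire on roughly one token position in every 5,600, which fits a narrow feature tied to words such as "risks" and "pitfalls".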