Explanations

primary intention or process

assistant/model-response segments within a chat transcript (i.e., content from the model’s turn rather than the user’s).

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

" Neighbourhood"  0.41
" thomas"  0.40
" নিঃসন্দেহে"  0.40
" pozycji"  0.39
" THOMAS"  0.39
" obstáculos"  0.39
" Loksatta"  0.39
" alltid"  0.38
"玹"  0.38
" berlin"  0.37

Positive Logits

" Highlander"  0.49
"linien"  0.47
" معين"  0.46
"પતિ"  0.46
"itul"  0.44
"нюю"  0.43
"ii"  0.42
"idas"  0.42
"يك"  0.41
"उच्च"  0.41

Activations Density 15.583%
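
As a rough sense of scale, the configuration figures can be combined with the activation density. The sketch below rests on two assumptions that the page does not confirm: that every prompt is exactly 512 tokens long, and that the density is the fraction of tokens in those same dashboard prompts on which the feature has a nonzero activation. The function and constant names are illustrative only.

```python
# Figures copied from the dashboard configuration and density readout.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
ACTIVATION_DENSITY_PERCENT = 15.583


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total tokens in the sample, assuming every prompt has the same length."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> int:
    """Estimated count of tokens on which the feature fires.

    Assumption: the density is measured over the same prompts as `total`.
    """
    return round(total * density_percent / 100)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:,}")
```

Under these assumptions the sample holds about 122 million tokens, and the feature would fire on roughly 19 million of them, which is dense for a single feature and fits the broad "assistant-turn" explanation given above.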