Explanations

preferred pronouns and preferences

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

(token not captured)    0.84
"א"    0.83
"ط"    0.79
"ท"    0.78
"ก"    0.77
"ב"    0.77
"ה"    0.75
"ห"    0.75
"т"    0.73

Positive Logits

" preferences"    1.04
" préférences"    1.04
" preference"    1.01
" preferências"    0.91
" Preference"    0.89
"preference"    0.86
" preferencia"    0.84
" предпоч"    0.82
" Preferences"    0.81
" prefer"    0.79

Activations Density 0.104%
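
To put the density figure in context, the short Python sketch below works out how many tokens the dashboard prompts contain and roughly how many of them the feature would fire on. It rests on two assumptions that the page itself does not confirm: that every prompt is exactly 512 tokens long, and that "Activations Density" means the share of tokens with a nonzero activation. The function names are illustrative only and do not come from any dashboard API.

```python
# Figures copied from the Configuration and Activations Density lines above.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY_PERCENT = 0.104


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total tokens, assuming every prompt has exactly the stated length."""
    return num_prompts * tokens_per_prompt


def approx_active_tokens(total: int, density_percent: float) -> int:
    """Rough count of tokens on which the feature fires.

    ASSUMPTION: density is the percentage of tokens with a nonzero
    activation. The dashboard does not define the term on this page.
    """
    return round(total * density_percent / 100)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = approx_active_tokens(total, DENSITY_PERCENT)

print(f"{total:,} tokens in total")
print(f"~{active:,} tokens with a nonzero activation")
```

Under those assumptions the feature is sparse: it would be active on roughly one token in a thousand.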