Explanations

woman words

References to female individuals, especially when the subject is a woman or girl referred to with feminine pronouns or names.

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Logit  Token
0.57   " способен"
0.56   " сам"
0.56   " sám"
0.55   "Escolhido"
0.54   " आला"
0.52   " равен"
0.52   " نفسه"
0.51   " который"
0.50   " должен"
0.50   " शकतो"

Positive Logits

Logit  Token
1.44   " herself"
1.05   " woman"
1.02   " actresses"
1.01   " girl"
1.01   " businesswoman"
1.00   " heroine"
0.97   " women"
0.97   " xinh"
0.97   " femenina"
0.96   " نفسها"

Activations Density 0.431%
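
To put the density figure in context, here is a rough sketch of the arithmetic. It combines the density with the prompt count and prompt length listed under Configuration. It assumes that "density" means the fraction of token positions on which the feature has a nonzero activation, and that every prompt contributes exactly 512 tokens. The page itself does not confirm either assumption, and the function names are illustrative only.

```python
# Figures copied from the dashboard's Configuration and Activations Density lines.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY_PERCENT = 0.431


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions, assuming every prompt has the same length."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimated activating positions, assuming density is a per-token fraction."""
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)

print(f"total token positions: {total:,}")
print(f"estimated activating positions: {active:,.0f}")
```

Under these assumptions the feature would fire on roughly one token position in every 232 (1 / 0.00431).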