Explanations

references to gender inequalities and societal expectations

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token      Logit
"icari"    -0.10
"arget"    -0.09
"indir"    -0.08
"รม"       -0.08
"aris"     -0.08
"allah"    -0.08
"もり"     -0.08
"aç"       -0.08
"onse"     -0.08
"anzi"     -0.07

Positive Logits

Token           Logit
" male"         0.24
" males"        0.20
" Male"         0.18
"male"          0.18
"男性"          0.16
" masculine"    0.16
"Male"          0.15
"men"           0.14
" мужчин"       0.14
" mascul"       0.14

Activations Density 0.032%
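
To get a feel for what a density of 0.032% means, the short calculation below combines it with the prompt configuration listed above. It assumes the density is the fraction of tokens in the 24,576 x 128 prompt set on which the feature has a nonzero activation. That is an interpretation, not something the page confirms, and the variable names are illustrative.

```python
# Values taken from the dashboard configuration.
num_prompts = 24_576
tokens_per_prompt = 128
density_percent = 0.032

# Assumption: density is measured over every token in the prompt set.
total_tokens = num_prompts * tokens_per_prompt
approx_active_tokens = round(total_tokens * density_percent / 100)

print(total_tokens)          # 3145728
print(approx_active_tokens)  # 1007
```

Under that reading, the feature fires on roughly one thousand of about 3.1 million tokens.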