INDEX

Explanations

mentions of familial relationships, particularly involving husbands and partners

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token        Logit
"acters"     -0.08
" Friend"    -0.07
"kke"        -0.07
"illa"       -0.07
"таб"        -0.07
"جاج"        -0.06
"opes"       -0.06
"еса"        -0.06
"incinn"     -0.06
"nap"        -0.06

Positive Logits

Token          Logit
" Bair"        0.07
" Highlands"   0.06
"Ran"          0.06
" Pier"        0.06
" settled"     0.06
"end"          0.05
" downs"       0.05
" gradient"    0.05
"bio"          0.05

Activations Density 0.007%
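
For a rough sense of scale, the density figure can be combined with the prompt configuration listed above. The sketch below assumes that "Activations Density" means the fraction of dashboard tokens on which the feature has a nonzero activation, and that it was measured over the same 24,576 prompts of 128 tokens each; neither assumption is stated on the page. The function name `estimate_active_tokens` is purely illustrative.

```python
# Figures copied from the dashboard; the interpretation of "density" as
# "fraction of tokens with nonzero activation" is an assumption.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.007  # as displayed, so already rounded


def estimate_active_tokens(num_prompts: int,
                           tokens_per_prompt: int,
                           density_percent: float) -> tuple[int, float]:
    """Return (total tokens, estimated number of tokens where the feature fires)."""
    total_tokens = num_prompts * tokens_per_prompt
    active_tokens = total_tokens * density_percent / 100.0
    return total_tokens, active_tokens


total, active = estimate_active_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT)
print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:.0f}")
```

Under these assumptions the feature would fire on only a few hundred of roughly three million tokens. Because the displayed density is rounded, the estimate is approximate.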