INDEX

Explanations

occurrences of authority or hierarchical structures and their actions

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

ارج

-0.07

جار

-0.07

istrovství

-0.07

líč

-0.07

اسی

-0.07

xies

-0.07

itchens

-0.06

yen

-0.06

_irq

-0.06

 frau

-0.06

Positive Logits

 altogether

0.10

 khỏi

0.08

iform

0.07

andin

0.07

ertos

0.06

 grips

0.06

 anymore

0.06

 consideration

0.06

fer

0.06

仕

0.06

Activations Density 0.035%