INDEX

Explanations

instances of statements that include criticism or claims about authority and legitimacy

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

uzu

-0.07

оба

-0.06

iros

-0.06

 hodně

-0.06

 bazen

-0.06

rir

-0.06

oomla

-0.06

ternet

-0.06

asty

-0.06

enis

-0.06

Positive Logits

 whatsoever

0.29

 WHATSOEVER

0.22

 whatever

0.21

whatever

0.18

 Whatever

0.17

Whatever

0.17

atever

0.14

 other

0.13

 except

0.12

soever

0.12

Activations Density 0.140%