
Explanations

words that indicate logical inconsistency or illogical situations


Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B


Negative Logits

Token          Logit
"cz"           -0.07
"crollView"    -0.07
"ğer"          -0.07
"CHASE"        -0.07
".decorate"    -0.06
"hoff"         -0.06
"вод"          -0.06
"utsch"        -0.06
"enderror"     -0.06
"stå"          -0.06

Positive Logits

Token          Logit
"logical"      0.16
"logic"        0.10
"icit"         0.10
" logical"     0.09
"icits"        0.09
"log"          0.08
"lict"         0.08
"og"           0.08
"le"           0.08
"LOG"          0.08

Activations Density 0.004%
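
To put the density figure in context, the sketch below works out how many token positions it would correspond to. It assumes (this is not stated on the page) that the density is the fraction of token positions in the dashboard prompts on which the feature fires; the function names are hypothetical and only the three constants come from the page.

```python
# Figures copied from the dashboard configuration and density line.
NUM_PROMPTS = 24_576        # "24,576 prompts"
TOKENS_PER_PROMPT = 128     # "128 tokens each"
DENSITY_PERCENT = 0.004     # "Activations Density 0.004%"


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(total: int, density_percent: float) -> float:
    """Token positions with a nonzero activation, ASSUMING density is
    measured as a percentage of all token positions in the prompts."""
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = expected_active_tokens(total, DENSITY_PERCENT)

print(f"token positions: {total:,}")
print(f"approx. active positions: {round(active)}")
```

Under that assumption, the feature would fire on roughly 126 of about 3.1 million token positions, which is consistent with a sparse feature tied to a narrow set of words.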