Explanations

instances of violence or aggressive actions

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

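Negative Logits

Token             Logit
"ught"            -0.08
"prav"            -0.07
".createObject"   -0.07
"eed"             -0.06
"áº¥n"            -0.06   (byte-level rendering, likely "ấn")
"fal"             -0.06
" meis"           -0.06
"spo"             -0.06
"ÏĦÎ±Î¹"          -0.06   (byte-level rendering, likely "ται")
"íĻĺ"             -0.06   (byte-level rendering, likely "환")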
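Some tokens in the list above appear as strings of accented Latin characters rather than readable text. This pattern is what a GPT-2-style byte-level BPE tokenizer produces when its vocabulary entries are displayed directly: each raw byte is shown as a stand-in printable character. The sketch below assumes that is the tokenizer family in use here (the page does not say so) and reimplements the well-known byte-to-character table to reverse it. The names `bytes_to_unicode` and `decode_token` are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style table mapping each byte to a printable character."""
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(0x21, 0x7F))      # '!' .. '~'
        + list(range(0xA1, 0xAD))    # '¡' .. '¬'
        + list(range(0xAE, 0x100))   # '®' .. 'ÿ'
    )
    mapping = {b: chr(b) for b in printable}
    # Every remaining byte is assigned the next code point starting at U+0100.
    offset = 0
    for b in range(256):
        if b not in mapping:
            mapping[b] = chr(256 + offset)
            offset += 1
    return mapping


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {c: b for b, c in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Turn a byte-level token string back into the UTF-8 text it encodes."""
    raw = bytearray()
    for ch in display:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Assumption: the dashboard already rendered this character
            # literally (for example a leading space), so keep it unchanged.
            raw.extend(ch.encode("utf-8"))
    # Tokens can end mid-character, so undecodable bytes are replaced, not raised.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for token in ["áº¥n", "ÏĦÎ±Î¹", "íĻĺ", " meis"]:
        print(repr(token), "->", repr(decode_token(token)))
```

Plain ASCII tokens such as " meis" pass through unchanged, so the helper can be applied to a whole logit list without special-casing.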

Positive Logits

Token      Logit
" into"    0.09
"off"      0.08
" away"    0.08
"into"     0.07
"ano"      0.07
"anos"     0.07
"aran"     0.06
"Geh"      0.06
"_into"    0.06
" back"    0.06

Activations Density 0.068%
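
To give the density figure a sense of scale, it can be combined with the prompt configuration listed earlier. The sketch below assumes that the density is the fraction of tokens on which the feature activates and that it was measured over the same 24,576 prompts of 128 tokens each; the page states neither explicitly, so the result is only a rough estimate. The function names are illustrative.

```python
# Figures copied from the dashboard; how they relate is an assumption (see above).
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.068 / 100  # 0.068% expressed as a fraction


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(n_tokens: int, density: float) -> float:
    """Approximate count of token positions where the feature is non-zero."""
    return n_tokens * density


if __name__ == "__main__":
    n = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    active = estimated_active_tokens(n, ACTIVATION_DENSITY)
    print(f"total tokens: {n:,}")
    print(f"estimated activating tokens: {active:,.0f}")
```

Under these assumptions the feature fires on roughly two thousand of about three million token positions, which is consistent with a sparse feature.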