Explanations

phrases related to health risks or medical concerns

Text segments that indicate or emphasize the severity of risks, negative health effects, or harmful consequences, particularly in health and safety contexts.

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

Token         Logit
"[Double"     -0.07
"ÐĴÑĸÐ´"      -0.07
"erator"      -0.07
"yro"         -0.06
"plits"       -0.06
"åłĤ"         -0.06
"ednou"       -0.06
" enthus"     -0.06
"plit"        -0.06
"ymce"        -0.06

Positive Logits

Token         Logit
" medical"    0.07
" effects"    0.07
"IMS"         0.06
" serious"    0.06
"medical"     0.06
" depending"  0.06
" éĸ"         0.06
"Effects"     0.06
" Alarm"      0.06
"Alarm"       0.06

Activations Density 0.022%
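
To put the density figure in context, the sketch below estimates how many tokens in the dashboard prompt set activate this feature. It assumes that the density is the fraction of all dashboard tokens with a nonzero activation, which the page does not state explicitly; the function name `estimate_active_tokens` is hypothetical and used only for illustration.

```python
def estimate_active_tokens(num_prompts: int,
                           tokens_per_prompt: int,
                           density_percent: float) -> tuple[int, float]:
    """Return (total tokens, estimated number of activating tokens).

    Assumption: density_percent is the share of all tokens in the
    prompt set on which the feature has a nonzero activation.
    """
    total_tokens = num_prompts * tokens_per_prompt
    active_tokens = total_tokens * (density_percent / 100.0)
    return total_tokens, active_tokens


# Values taken from the Configuration and Activations Density lines.
total, active = estimate_active_tokens(24_576, 128, 0.022)
print(f"total tokens: {total:,}")
print(f"estimated activating tokens: {active:.0f}")
```

Under that assumption, the feature fires on only a few hundred of the roughly 3.1 million tokens, which is consistent with a narrow, topic-specific feature.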