Explanations

terms related to offensive strategies and actions

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

+offset    -0.09
oral       -0.08
nez        -0.08
нез        -0.07
oris       -0.07
dera       -0.07
ily        -0.07
oria       -0.06
ilk        -0.06
uckets     -0.06

Positive Logits

ensively   0.08
anagan     0.08
/off       0.07
ifer       0.07
entlich    0.07
ive        0.07
/on        0.07
ally       0.07
/support   0.06
portun     0.06

Activations Density 0.009%