Explanations

hate

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token (quoted to show leading spaces)    Logit
" hate"        -1.25
" hates"       -1.16
"hate"         -1.09
" hating"      -1.05
" Hate"        -1.02
"Hate"         -1.02
" hated"       -1.00
" HATE"        -0.94
" dislikes"    -0.93
" dislike"     -0.93

Positive Logits

Token (quoted to show leading spaces)    Logit
" ویکی‌پدیا"    0.66
" Roskov"    0.63
"Története"    0.59
" وتسجيلات"    0.57
" termica"    0.56
"脚注の使い方"    0.55
" sagesse"    0.54
"AsUp"    0.54
" nahilalakip"    0.54
" numéros"    0.54

Activations Density 0.014%
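
As a rough sanity check on the density figure, the sketch below estimates how many token positions it corresponds to. It rests on two assumptions not stated on this page: that the density is the fraction of token positions where the feature has a nonzero activation, and that it was measured over the dashboard prompts listed under Configuration. The function names are hypothetical, and because the displayed percentage is rounded, the result is only approximate.

```python
# Values quoted from the Configuration and Activations Density lines.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.014  # rounded as displayed, so the estimate is approximate


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Estimated positions with nonzero activation (assumed meaning of density)."""
    return density_percent / 100.0 * n_tokens


n_tokens = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(DENSITY_PERCENT, n_tokens)

print(n_tokens)       # 2097152
print(round(active))  # 294
```

Under those assumptions, the feature would fire on roughly 294 of about 2.1 million token positions, which is consistent with a sparse feature tied to a narrow set of tokens.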