Explanations

it's fair or safe to say

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

" menik"       -0.91
" häng"        -0.88
"лта"          -0.78
"cially"       -0.76
" quartier"    -0.74
"afternoon"    -0.73
" château"     -0.72
";;)"          -0.72
"++."          -0.71
"ilets"        -0.70

Positive Logits

" fair"        4.25
"fair"         2.70
" safe"        2.58
"Fair"         2.55
" Fair"        2.34
" fairs"       2.25
"FAIR"         2.19
" FAIR"        2.14
" fairness"    1.97
" accurate"    1.70

Activations Density 0.030%
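
As a rough sanity check on scale, the configuration figures can be combined with the reported density. This is a minimal sketch, assuming that "Activations Density" means the fraction of dashboard tokens on which the feature has a nonzero activation; the page itself does not define the term, and the function names below are illustrative only.

```python
# Figures copied from the dashboard configuration and density line.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.030  # reported activation density, in percent


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimated count of positions where the feature fires.

    Assumption (not stated on the page): density is
    (positions with nonzero activation) / (all positions).
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)
print(f"total token positions: {total:,}")
print(f"estimated active positions: {round(active):,}")
```

Under that assumption, the feature would fire on roughly 944 of the 3,145,728 token positions, which fits a feature tied to a narrow phrase such as "fair or safe to say".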