Explanations

unethical practices and behavior

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

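Negative Logits

Token                    Logit
"UW"                     1.96
(token not recovered)    1.92
"{'"                     1.90
"Emo"                    1.86
(token not recovered)    1.82
(token not recovered)    1.79
"cross"                  1.76
(token not recovered)    1.73
"Commander"              1.73
"AA"                     1.71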

Positive Logits

Token                    Logit
"ქმედ"                   3.00
" coveredmethods"        2.76
" meadows"               2.71
"ेक्षित"                   2.71
"ি"                      2.69
"勰"                     2.60
"徭"                     2.55
"atsion"                 2.55
"ی"                      2.53
"্থিত"                    2.51

Activations Density 0.001%