INDEX

Explanations

given, treated, or trained

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

(tokens are quoted so that leading spaces stay visible)

"เสียหาย"    0.76
" متاثر"    0.74
" dépass"    0.73
" alterar"    0.70
"зить"    0.68
" Backend"    0.67
" ترتیب"    0.66
" disparu"    0.66
"ங்காய்"    0.66
" Happened"    0.65

Positive Logits

(tokens are quoted so that leading spaces stay visible)

" given"    1.99
" allowed"    1.63
" granted"    1.62
" told"    1.61
" treated"    1.61
"given"    1.60
" assigned"    1.58
" encouraged"    1.57
" instructed"    1.51
"Given"    1.50

Activations Density 0.265%
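
To give the density figure a sense of scale, the sketch below converts it into an approximate token count. It assumes, and the page does not confirm this, that the density is the fraction of tokens in the dashboard prompt set (238,145 prompts of 512 tokens each) on which the feature has a nonzero activation. The variable names are illustrative only.

```python
# Assumption (not stated on the page): activation density is the fraction
# of dashboard tokens on which the feature fires at all.
NUM_PROMPTS = 238_145        # from "Prompts (Dashboard)"
TOKENS_PER_PROMPT = 512      # from "Prompts (Dashboard)"
DENSITY_PERCENT = 0.265      # from "Activations Density"

# Total number of token positions in the dashboard prompt set.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Approximate number of token positions where the feature is active.
estimated_active_tokens = round(total_tokens * DENSITY_PERCENT / 100)

print(f"total tokens: {total_tokens:,}")
print(f"estimated active tokens: {estimated_active_tokens:,}")
```

If the density was measured over a different sample than the dashboard prompts, the second number does not apply; only the total token count follows directly from the configuration.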