Explanations

perpetrator, attacker, culprit, assailant, thief

Configuration

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

(token not shown)  0.77
ー  0.75
ER  0.74
(token not shown)  0.73
(token not shown)  0.71
н  0.71
بر  0.71
ন  0.70
(token not shown)  0.69

Positive Logits

 perpetrators  0.74
 perpetrator  0.70
 assail  0.70
 culprits  0.68
 attackers  0.66
 assailant  0.62
 burgl  0.62
 athletic  0.61
 vant  0.61
 thieves  0.61

Activations Density 0.022%