Explanations

code and measurements

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

Token        Value
üst          0.82
jacobian     0.81
觑           0.80
fabs         0.79
cede         0.78
 തിരഞ്ഞെടു   0.78
ülő          0.78
 ziff        0.77
printf       0.77
 escolha     0.76

Positive Logits

Token        Value
&&           0.85
===          0.79
||           0.79
!==          0.75
 Brain       0.72
&&(          0.72
?'           0.71
&&           0.71
Gaz          0.69
đa           0.68

Activations Density 0.081%
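
To put the density figure in context, the following is a rough sketch of the arithmetic, not something the dashboard itself reports. It assumes that every prompt contributes exactly 512 tokens, that the density was measured over this same prompt set, and that "activations density" means the fraction of tokens on which the feature has a nonzero activation. The function and constant names are hypothetical and chosen only for illustration.

```python
# Figures copied from the dashboard text above.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
ACTIVATION_DENSITY_PERCENT = 0.081


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total tokens in the prompt set, assuming fixed-length prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Approximate number of tokens on which the feature fires.

    Assumption: density is the share of tokens with nonzero activation.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"estimated activating tokens: {active:,.0f}")
```

Under these assumptions the prompt set holds about 122 million tokens, and the feature would fire on roughly one token in every 1,200.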