INDEX

Explanations

speaking or actions causing effects

Configuration

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

tive    1.43
 stripper    1.18
 lingering    1.17
 irregularities    1.16
 slight    1.16
 raping    1.14
ᆨ    1.13
 pissed    1.11
 depressed    1.09
 drunk    1.08

Positive Logits

 dúvida    1.05
 Verificar    1.05
ю    1.04
라    1.01
 країн    0.99
行って    0.99
uparavant    0.98
лки    0.98
 porówn    0.95
лке    0.95

Activations Density 0.000%