Explanations

providing harmful information or instructions

The neuron responds to content (action) verbs – the main non-auxiliary verbs in a sentence.

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

" सभी": 0.78
" tüm": 0.77
"所有": 0.75
"Numero": 0.70
" všech": 0.70
"their": 0.69
"すべての": 0.69
" unsurpassed": 0.69
(token not shown): 0.66
" wszyscy": 0.66

Positive Logits

" एखा": 0.88
" eines": 0.83
" sebuah": 0.77
(token not shown): 0.73
" suatu": 0.73
" একটা": 0.72
" isang": 0.71
"een": 0.70
" একটি": 0.70
"isches": 0.68

Activations Density 0.606%
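
To put the density figure in perspective, the following is a rough back-of-the-envelope sketch rather than the dashboard's own calculation. It assumes that the density is the fraction of tokens on which the feature has a nonzero activation, and that it was measured over the same 392,802 prompts of 256 tokens each listed under Configuration; neither assumption is confirmed by the page. The function name `estimate_active_tokens` is hypothetical.

```python
def estimate_active_tokens(
    num_prompts: int, tokens_per_prompt: int, density_percent: float
) -> tuple[int, float]:
    """Return (total tokens, estimated tokens with nonzero activation).

    Assumption: density_percent is the share of all tokens in the prompt
    set on which the feature fires. The page does not state this.
    """
    total_tokens = num_prompts * tokens_per_prompt
    active_tokens = total_tokens * density_percent / 100.0
    return total_tokens, active_tokens


# Figures taken from the Configuration and Activations Density entries.
total, active = estimate_active_tokens(392_802, 256, 0.606)
print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:,.0f}")
```

Under these assumptions the prompt set holds about 100.6 million tokens, and the feature would fire on roughly 0.6 million of them.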