INDEX

Explanations

action verbs

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token (quotes mark a leading space)    Logit
featureID                              -0.84
" nahilalakip"                         -0.59
styleType                              -0.56
ỏng                                    -0.51
the                                    -0.50
" conmigo"                             -0.50
" MEANS"                               -0.50
knapp                                  -0.50
usky                                   -0.50
ỡng                                    -0.49

Positive Logits

Token (quotes mark a leading space)    Logit
" which"                               0.57
" where"                               0.50
AndroidJUnit                           0.50
IsContent                              0.49
" wohin"                               0.49
bitField                               0.47
" kubwa"                               0.46
inhua                                  0.45
UserRepository                         0.45

Activations Density 0.003%
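
To put the density figure in context, the short sketch below combines it with the prompt configuration listed under Configuration. It assumes, without confirmation from the page itself, that "Activations Density" means the percentage of dashboard tokens on which the feature fires; the function name `expected_active_tokens` is hypothetical and used only for illustration.

```python
def expected_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, estimated tokens on which the feature is active).

    Assumption: density_percent is the share of all dashboard tokens
    with a non-zero activation, expressed as a percentage.
    """
    total_tokens = num_prompts * tokens_per_prompt
    active_tokens = total_tokens * density_percent / 100.0
    return total_tokens, active_tokens


# Values taken from the dashboard: 16,384 prompts of 128 tokens, density 0.003%.
total, active = expected_active_tokens(16_384, 128, 0.003)
print(total)          # 2097152
print(round(active))  # 63
```

Under that assumption, the feature would be active on roughly 63 of about 2.1 million tokens. Because the displayed density is rounded, this is an estimate rather than an exact count.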