INDEX

Explanations

роф. I am choosing this because this token is repeated in MAX_ACTIVATING_TOKENS.
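
The selection rule stated in the explanation above (pick the token that repeats among the max-activating tokens) can be sketched as follows. This is only an illustration under assumptions: the function name `most_repeated_token` and the sample list are hypothetical, and the sketch is not the auto-interp tool's actual implementation.

```python
from collections import Counter


def most_repeated_token(max_activating_tokens):
    """Return the token that occurs most often, or None if no token repeats."""
    counts = Counter(max_activating_tokens)
    if not counts:
        return None
    # most_common(1) yields the single (token, count) pair with the highest count.
    token, count = counts.most_common(1)[0]
    return token if count > 1 else None


# Hypothetical input: illustrative values only, not data taken from this dashboard.
example_tokens = ["роф", " the", "роф", "роф", " of"]
print(most_repeated_token(example_tokens))
```

Returning None when nothing repeats keeps the rule honest: without a repeated token there is no evidence for this kind of explanation.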

Negative Logits

Token (quoted to show leading spaces)    Logit
" charms"       -0.09
" cardstock"    -0.08
" charm"        -0.08
" codecs"       -0.08
" volt"         -0.08
" Libert"       -0.08
" χει"          -0.08
"zak"           -0.07
" airborne"     -0.07
" ballistic"    -0.07

Positive Logits

Token (quoted to show leading spaces)    Logit
"very"          0.07
" compartment"  0.07
"-Val"          0.07
"healthy"       0.07
"ef"            0.07
"esp"           0.07
"don"           0.07
"quinas"        0.07
".Full"         0.07

Activations Density 0.001%