Explanations

whether or not it is

Configuration

Prompts (Dashboard)

10,000 prompts, 128 tokens each

Dataset (Dashboard)

lmsys/lmsys-chat-1m

Negative Logits

 neither

-0.16

 não

-0.13

 tidak

-0.12

ไม

-0.12

 nicht

-0.12

 không

-0.12

 nemus

-0.11

 cannot

-0.11

 didn

-0.11

not

-0.11

Positive Logits

 ever

0.26

 вообще

0.20

 indeed

0.19

 vůbec

0.19

 EVER

0.17

 truly

0.17

 actually

0.17

 really

0.17

本当に

0.16

 действительно

0.15

Activations Density 0.125%
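
As a rough sanity check, the density figure can be turned into an approximate token count. The sketch below assumes that the density is the fraction of tokens on which the feature fires, and that it was measured over the dashboard prompts listed under Configuration (10,000 prompts, 128 tokens each). The page does not state either of these, so the result is only an order-of-magnitude estimate. The variable names are illustrative.

```python
# Assumption: the density is the share of tokens with a nonzero activation,
# measured over the dashboard prompts. The page does not confirm this.
NUM_PROMPTS = 10_000        # from "10,000 prompts"
TOKENS_PER_PROMPT = 128     # from "128 tokens each"
DENSITY_PERCENT = 0.125     # from "Activations Density 0.125%"

total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
expected_active_tokens = round(total_tokens * DENSITY_PERCENT / 100)

print(total_tokens)            # 1280000
print(expected_active_tokens)  # 1600
```

Under those assumptions the feature would fire on roughly 1,600 of 1,280,000 tokens, or about one token in 800.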