Explanations

pronoun followed by 'are' or 'is'

Configuration

Prompts (Dashboard)

10,000 prompts, 128 tokens each

Dataset (Dashboard)

lmsys/lmsys-chat-1m

Negative Logits

¶Į

-0.12

itia

-0.09

阅读次数

-0.09

©©

-0.08

好的

-0.08

っと

-0.08

°}

-0.08

leine

-0.08

فران

-0.08

Positive Logits

're

0.17

’re

0.15

've

0.13

are

0.12

’ve

0.11

 ﾄ

0.10

 är

0.10

'm

0.10

 adalah

0.10

 Cannot

0.10

Activations Density 0.202%
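
To give the density figure a sense of scale, the sketch below converts it into an approximate count of activating tokens. It assumes the density was measured over exactly the dashboard prompt set listed under Configuration (10,000 prompts of 128 tokens each); the page does not confirm this, and the variable names are illustrative only.

```python
# Assumption: activation density is measured over the dashboard prompt set
# (10,000 prompts x 128 tokens). The page itself does not state this.
NUM_PROMPTS = 10_000
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.202  # "Activations Density" as reported on the page

# Total token positions the feature could have fired on.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Approximate number of positions with a nonzero activation.
expected_active = round(total_tokens * DENSITY_PERCENT / 100)

print(f"total tokens:     {total_tokens:,}")
print(f"active (approx.): {expected_active:,}")
```

Under that assumption, the feature fires on roughly one token in every 500, which fits a feature tied to a narrow grammatical context.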