Explanations

protection, protect, protective

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

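Negative Logits

Token       Logit
"玖"        -0.10   (Chinese: "nine", formal numeral)
"姨"        -0.09   (Chinese: "aunt")
"虚"        -0.09   (Chinese: "empty, void")
"亲爱"      -0.09   (Chinese: "dear")
"ç©"        -0.09
"IRA"       -0.09
"行长"      -0.09   (Chinese: "bank president")
"chip"      -0.09
"uez"       -0.08
"高涨"      -0.08   (Chinese: "surge")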

Positive Logits

Token            Logit
"保护"           0.21   (Simplified Chinese: "protect")
" protection"    0.18
"保護"           0.17   (Traditional Chinese: "protect")
" protect"       0.16
" Protection"    0.16
" protects"      0.15
" against"       0.14
"protect"        0.14
" protecting"    0.14
"Protection"     0.13
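
Non-ASCII tokens in a byte-level BPE vocabulary are stored in a printable byte alphabet, which is why a token such as 保护 can surface as ä¿ĿæĬ¤ in a raw export of this dashboard. The sketch below shows one way to map such display strings back to text. It assumes the vocabulary uses the GPT-2 style byte-to-unicode table and that the dashboard renders the space marker as an ordinary space; neither is confirmed by the page itself, and `decode_token` is a hypothetical helper name.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Rebuild the GPT-2 style table mapping each byte to a printable character."""
    # Bytes that are already printable keep their own code point.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    offset = 0
    # Every remaining byte is assigned the next free code point from 256 upward.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return {b: chr(cp) for b, cp in zip(byte_values, code_points)}


UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Map a displayed token string back to the text it encodes (hypothetical helper)."""
    raw = bytearray()
    for ch in display:
        if ch == " ":
            # Assumption: the dashboard shows the space marker as a plain space.
            raw.append(0x20)
        else:
            raw.append(UNICODE_TO_BYTE[ch])
    # A token may hold only part of a multi-byte character, so replace
    # undecodable bytes instead of raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("ä¿ĿæĬ¤"))      # 保护
    print(decode_token("ä¿ĿèŃ·"))      # 保護
    print(decode_token(" protection"))  # " protection", unchanged
```

A token such as ç© covers only two bytes of a three-byte UTF-8 character, so it has no standalone text form and decodes to a replacement character.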

Activations Density 0.090%
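
To put the density figure in context, the arithmetic below combines it with the dashboard configuration of 16,384 prompts at 128 tokens each. This is a rough sketch that assumes the density is the fraction of token positions in those prompts where the feature is active; the page does not state how the figure is computed.

```python
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.090 / 100  # 0.090% expressed as a fraction

# Total token positions covered by the dashboard prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density is the share of those positions with a nonzero activation.
expected_active_tokens = total_tokens * ACTIVATION_DENSITY

print(total_tokens)                   # 2097152
print(round(expected_active_tokens))  # 1887
```

Under that assumption the feature would be active on roughly 1,900 of about 2.1 million token positions.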