INDEX

Explanations

phrases that indicate intent or purpose in actions

Negative Logits

Token            Logit
" convol"        -0.76
"mbuds"          -0.68
"ouses"          -0.66
"comings"        -0.66
" Puzz"          -0.65
" expire"        -0.63
" Pengu"         -0.63
" throats"       -0.62
" Gleaming"      -0.60
" polymorph"     -0.60

Positive Logits

Token            Logit
"ertation"       0.77
" mainly"        0.69
"Pwr"            0.69
"uci"            0.66
"hay"            0.64
"Tow"            0.64
" primarily"     0.63
"IGHT"           0.61
" squarely"      0.61
" mostly"        0.61

Activations Density: 0.036%