INDEX

Explanations

myths

Negative Logits

Token    Logit
得    -0.08
 kubona    -0.08
宝    -0.07
 लेते    -0.07
 escolher    -0.07
 recib    -0.07
 encuent    -0.07
 combinar    -0.07
 الله    -0.07
 bord    -0.07

Positive Logits

Token    Logit
 misconceptions    0.27
 misconception    0.25
 misinformation    0.19
 miscon    0.18
 misunderstanding    0.18
 misunderstand    0.17
 misunderstood    0.17
 misleading    0.16
 myths    0.15
 mistaken    0.14

Activations Density 0.046%