Explanations

potential offensive statements

Configuration

Dataset (Dashboard): Various

Negative Logits

Token             Logit
"bass"            -0.08
" toilets"        -0.08
" 확보"           -0.07
"lange"           -0.07
" ежегод"         -0.07
" record"         -0.07
"pod"             -0.07
" Чем"            -0.07
" Verified"       -0.07
" pharmacies"     -0.07

Positive Logits

Token                 Logit
" ofens"              0.13
" offend"             0.13
" offended"           0.13
" inadvert"           0.13
" offending"          0.12
" interpretations"    0.12
" harassment"         0.12
" sarcas"             0.12
" offensive"          0.11
" unintended"         0.11

Activations Density: 0.069%