Explanations

abide

Negative Logits

Token (quoted to preserve leading spaces)    Logit
" characteristics"    -0.08
" invit"    -0.08
"farben"    -0.08
"Kry"    -0.07
" Erlebnis"    -0.07
"깔"    -0.07
"-indent"    -0.07
" provoking"    -0.07
" caos"    -0.07
" Characteristics"    -0.07

Positive Logits

Token (quoted to preserve leading spaces)    Logit
"限制"    0.13
" ограничения"    0.12
" restricciones"    0.11
" beperk"    0.10
" restrictions"    0.10
" imposed"    0.10
" censorship"    0.10
"Restrictions"    0.10
"Restr"    0.10
" Restrictions"    0.10

Activations Density 0.008%