INDEX

Explanations

allowed to say anything

This neuron detects expressions of personal freedom or autonomy (e.g., words like “allowed,” “free,” “operate,” “autonomy,” “freedom”).


Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted


Negative Logits

 невозможно: 0.82
Sadly: 0.80
heureusement: 0.78
 સમયે: 0.77
 अशुभ: 0.77
 दुर्भाग्य: 0.77
才知道: 0.77
 તમારા: 0.75
ㅠ: 0.74
 действу: 0.74

Positive Logits

 freely: 1.42
自由に: 1.33
 unlimited: 1.26
 unrestricted: 1.25
自由: 1.19
 자유: 1.16
 Unlimited: 1.13
 свободно: 1.13
 libre: 1.13
随意: 1.11

Activations Density 1.429%
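
To give the density figure a sense of scale, the short sketch below combines it with the prompt configuration listed above. It assumes that "Activations Density" means the fraction of all dashboard tokens on which the neuron has a nonzero activation. That definition is an assumption, not something this page states. The function names are hypothetical and used only for illustration.

```python
# Figures copied from the dashboard configuration above.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
ACTIVATION_DENSITY = 0.01429  # 1.429%, as reported on the page


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density: float) -> int:
    """Approximate count of tokens with nonzero activation.

    ASSUMPTION: density is the fraction of tokens on which the neuron fires.
    """
    return round(total * density)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY)

print(f"total tokens:            {total:,}")
print(f"estimated active tokens: {active:,}")
```

Under that assumption, the neuron would fire on roughly 1.4 million of the approximately 100 million tokens in the dashboard sample.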