Explanations

annoying, irritating, frustrated

The neuron fires strongly on words and phrases that express annoyance or irritation (e.g., “annoying,” “irritating,” “pisses off”).

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token           Logit
(missing)       -2.77
" kanske"       -1.65
" nocturna"     -1.61
"ၞ"             -1.49
" With"         -1.48
" multitude"    -1.47
"kinci"         -1.39
"剤"            -1.38
"ns"            -1.37
"–"             -1.36

Positive Logits

Token           Logit
" всички"       1.88
"妧"            1.79
" MILLION"      1.77
"medriver"      1.72
" wszystkie"    1.66
"閦"            1.63
" obligación"   1.63
(blank)         1.60
" pudieron"     1.59
"争议"          1.59

Activations Density 0.024%
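
To put the density figure in context, the short sketch below works out how many tokens the dashboard prompts contain and roughly how many of them the neuron would be active on. It assumes that the density is the share of all dashboard tokens on which the activation is nonzero, and that it was measured over the 24,576 prompts of 128 tokens listed under Configuration; neither assumption is stated on the page. The function names are illustrative only.

```python
# Figures copied from the dashboard; everything else is an assumption.
NUM_PROMPTS = 24_576        # "24,576 prompts"
TOKENS_PER_PROMPT = 128     # "128 tokens each"
DENSITY_PERCENT = 0.024     # "Activations Density 0.024%"


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Number of token positions covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


def implied_active_tokens(total: int, density_percent: float) -> float:
    """Active-token count implied by a density given as a percentage.

    Assumes density = active tokens / total tokens.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = implied_active_tokens(total, DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"implied active tokens: about {active:.0f}")
```

Under these assumptions the neuron is active on several hundred of roughly three million tokens. The displayed density is rounded, so the implied count is only approximate.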