Explanations

equivalent phrases

This neuron detects occurrences of the word “equivalent,” especially where it’s used with numerical values.

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted


Negative Logits

Token                   Logit
" delightfully"         -2.81
"癭"                    -2.78
" dando"                -2.77
(token not captured)    -2.69
(token not captured)    -2.63
" serán"                -2.61
" shimmering"           -2.59
(token not captured)    -2.58
"ort"                   -2.52
" pivotal"              -2.50

Positive Logits

Token                   Logit
" strives"              2.77
" tries"                2.77
" undeniably"           2.70
" sizable"              2.63
" That"                 2.58
" blatantly"            2.58
" Seems"                2.56
"陞"                    2.55
"!!!!!"                 2.55
" creates"              2.53

Activations Density 0.008%
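
To put the density figure in context, the short sketch below works out how many tokens the prompt configuration covers and roughly how many of them the neuron would fire on. It assumes that the density is the share of tokens with a nonzero activation and that it was measured over the 24,576 prompts of 128 tokens listed under Configuration; the page itself does not confirm either point, and the function names are illustrative only.

```python
# Figures copied from the Configuration and Activations Density lines.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.008


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimated number of tokens on which the neuron activates.

    Assumption: density is the percentage of tokens with nonzero activation.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"Total tokens: {total:,}")
print(f"Estimated active tokens: {active:.0f}")
```

Because the density is shown rounded to one significant figure, the estimate is only an order-of-magnitude guide: a few hundred activating tokens out of roughly three million.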