Explanations
This neuron activates strongly on Markdown formatting tokens, particularly the `**` that marks bold text and the `*` that marks list items, most often where they introduce a new section or bullet point in structured text.
Negative Logits
them      0.67
but       0.59
or        0.58
etc       0.57
,         0.54
plus      0.54
me        0.54
ones      0.54
others    0.54
e         0.51
Positive Logits
Although         1.45
Despite          1.43
While            1.42
Unlike           1.34
During           1.31
The              1.27
Although         1.26
Since            1.25
As               1.24
Traditionally    1.23
Activations Density 5.436%
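
The page does not say how the density figure is computed. One common reading, assumed here rather than confirmed by the source, is the fraction of sampled token positions on which the neuron's activation is above some threshold. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    neuron fires at all. The source page does not define the term.
    """
    if len(activations) == 0:
        raise ValueError("activations must be non-empty")
    # Count positions where the neuron is considered active.
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations for a short text sample.
sample = [0.0, 0.0, 1.2, 0.0, 0.4, 0.0, 0.0, 0.0]
print(f"{activation_density(sample):.3%}")  # prints "25.000%"
```

Under this reading, a density of 5.436% would mean the neuron is active on roughly one in eighteen sampled tokens.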