Explanations
This neuron consistently activates on the first content word of a new text segment or paragraph, marking where a passage begins.
Negative Logits
Appointment    -0.07
[OF            -0.06
processors     -0.06
|^             -0.06
Odkazy         -0.06
pwd            -0.06
cie            -0.06
subdir         -0.06
Velvet         -0.06
thưởng         -0.06
Positive Logits
<↵             0.07
inaire         0.07
’elle          0.07
trib           0.07
صنایع          0.07
Indicates      0.06
(parent        0.06
Corporation    0.06
:A             0.06
CSS            0.06
Activations Density 0.046%
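
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number could be computed. It assumes that "activation density" means the fraction of sampled token positions on which the neuron's activation exceeds a threshold. The page itself does not state this definition. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: density is defined as (active positions) / (all positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 100,000 token positions, 46 of which are active.
toy = [0.0] * 99_954 + [1.3] * 46
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.046%
```

Under this assumed definition, a density of 0.046% corresponds to roughly 46 active positions per 100,000 sampled tokens, which would be consistent with a neuron that fires only at passage beginnings.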