Writing to the same file in the same job

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

abc123
Premium Member
Posts: 605
Joined: Fri Aug 25, 2006 8:24 am

Writing to the same file in the same job

Post by abc123 »

I would like to append lines to the same sequential file from 2 different stages in the same job. In my current job, the second stage does not write even though I have set the update mode to Append. If the first stage writes 5 lines, I would like the second one to append, for example, another 3 lines, so the file will end up with 8. I already know of 2 techniques:

1) Write to 2 files and create another job to union the files.
2) Run a Unix command to do it as an after-job ExecSH routine.

I was wondering if it is possible to do it in Datastage in one job.
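
For reference, the first technique (write two files, then combine them once both writers have finished) can be sketched outside DataStage as follows. This is only a minimal illustration in Python, not a DataStage feature: the file names `stage1.txt`, `stage2.txt` and `combined.txt` and the helper `concatenate` are hypothetical, and the 5-line and 3-line inputs mirror the example counts above.

```python
import shutil
import tempfile
from pathlib import Path


def concatenate(parts, target):
    """Copy each part file into target, in the order given."""
    with open(target, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


# Hypothetical stand-ins for the two files written by the two stages.
workdir = Path(tempfile.mkdtemp())
first = workdir / "stage1.txt"
second = workdir / "stage2.txt"
first.write_text("".join(f"a{i}\n" for i in range(5)))
second.write_text("".join(f"b{i}\n" for i in range(3)))

# Combine only after both files are completely written and closed.
combined = workdir / "combined.txt"
concatenate([first, second], combined)
combined_lines = combined.read_text().splitlines()
line_count = len(combined_lines)
```

The same effect is usually achieved in an after-job routine with a plain file concatenation command; the point is simply that the combine step runs after both writers are done.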
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Multiple writes to the same file are not allowed by the OS. You will get misaligned data.
If this were a server job and you had two links going in, then, theoretically, records would be sent in a round-robin fashion, and the option should be set to append in both Sequential File stages. I guess you can try the same in a PX job by running on a single node.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

DSguru2B wrote:Multiple writes to the same file are not allowed by the OS. You will get misaligned data.
If this were a server job and you had two links going in, then, theoretically, records would be sent in a round-robin fashion, and the option should be set to append in both Sequential File stages. I guess you can try the same in a PX job by running on a single node.
You've just contradicted yourself. As noted, multiple writers are not supported - period. This is the nature of sequential media - no Server or Parallel job or OS process can break that rule. It's just The Way It Works. There is no 'round robin' process.

The only way one job could write to a file twice is to ensure the first process completes in its entirety before the second process ever starts. Then the second can append data to the end of the first process's work. Otherwise, write to two files and concatenate post job.
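
A minimal sketch of that serial pattern, written in Python rather than DataStage purely for illustration: the first writer opens, writes and closes the file, and only then does the second writer open it in append mode. The helper `write_stage`, the file name `target.txt` and the row values are hypothetical.

```python
import os
import tempfile


def write_stage(path, rows, mode):
    """Write rows to path; the file is closed when the with-block exits."""
    # mode "w" overwrites the file, mode "a" appends to its end.
    with open(path, mode) as f:
        for row in rows:
            f.write(row + "\n")


path = os.path.join(tempfile.mkdtemp(), "target.txt")

# First writer runs to completion and closes the file.
write_stage(path, [f"first-{i}" for i in range(5)], "w")

# Only now does the second writer start, appending to the end.
write_stage(path, [f"second-{i}" for i in range(3)], "a")

with open(path) as f:
    lines = f.read().splitlines()
```

Because the two calls never overlap in time, the appended rows land cleanly after the first writer's rows.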
-craig

"You can never have too many knives" -- Logan Nine Fingers
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

I know I contradicted myself; the second thought came up as I was writing.
When you have two links going out to the same file and you have the links ordered, then a single row will go through the first link first and then to the second link (round robin). This way each record will be appended to the file, one at a time. It will be two separate operations for the OS, but to the naked eye, a single process.
I have not tried it, but theoretically it should work, as they are considered two separate processes by the OS.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

DSguru2B wrote:When you have two links going out to the same file and you have the links ordered, then a single row will go through the first link first and then to the second link (round robin). This way each record will be appended to the file, one at a time. It will be two separate operations for the OS, but to the naked eye, a single process.
I have not tried it, but theoretically it should work, as they are considered two separate processes by the OS.
No.
-craig

"You can never have too many knives" -- Logan Nine Fingers
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Guess what, you are right. Funny ideas keep popping into my head :oops:
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
crouse
Charter Member
Posts: 204
Joined: Sun Oct 05, 2003 12:59 pm

Post by crouse »

If this were a server job, just use the link collector stage to let 1 or more transformer stages write to the same seq file.

What about a Funnel stage in PX? (My PX naivete may be showing here.)
Craig Rouse
Griffin Resources, Inc
www.griffinresources.com
mctny
Charter Member
Posts: 166
Joined: Thu Feb 02, 2006 6:55 am

Post by mctny »

crouse wrote:If this were a server job, just use the link collector stage to let 1 or more transformer stages write to the same seq file.
...


But this will not guarantee that the second link's rows are appended after the first link's rows, i.e., the records from the two links will be mixed. I guess the same will happen with the Funnel stage in PX.
Thanks,
Chad
__________________________________________________________________
"There are three kinds of people in this world; Ones who know how to count and the others who don't know how to count !"
ady
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

Wouldn't a job be able to write to the same sequential file twice if there is a delay between the two write operations?

That is, if the file is not written to from the same active stage?
Be nice to nerds. Chances are you’ll end up working for one.

--- Bill Gates
mctny
Charter Member
Posts: 166
Joined: Thu Feb 02, 2006 6:55 am

Post by mctny »

ady wrote:Wouldn't a job be able to write to the same sequential file twice if there is a delay between the two write operations?

That is, if the file is not written to from the same active stage?
As Craig said, update/read/write operations on a sequential file are governed by the OS. If you open a sequential file for read/write/update, you cannot reopen it until you have closed it, no matter what application you are using, whether it is DataStage or a programming language.
Thanks,
Chad
__________________________________________________________________
"There are three kinds of people in this world; Ones who know how to count and the others who don't know how to count !"
ady
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

I have a server job with this design:

Transformer ---> SeqFile ---> Transformer ---> SeqFile (the same file)

It works fine. Is it only in parallel jobs that this is not possible?
Be nice to nerds. Chances are you’ll end up working for one.

--- Bill Gates
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

ady wrote:Wouldn't a job be able to write to the same sequential file twice if there is a delay between the two write operations?

That is, if the file is not written to from the same active stage?
Yes, as I explained earlier, the two writer processes must run in a serial fashion, one after the other. Once the first completes and closes the file, the second can open, seek to the end and append.

Multiple readers, single writer.

Your job design is exactly what I meant.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ady
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

Yup, yup... got it now. I was confused earlier, because I have a few jobs which depend on that design :shock: :wink:
Be nice to nerds. Chances are you’ll end up working for one.

--- Bill Gates
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Use a Funnel Stage

Post by asorrell »

Use a Funnel Stage prior to your output file. On Stage Properties select "Funnel Type = Sequence".

Per the help text: Sequence copies all records from the first input data set to the output data set, then all the records from the second input data set, etc.

This would allow you to "drain" the first input, then append the second input.
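
To make the ordering concrete, here is a small Python sketch that models only the behaviour described in the help text quoted above (all of the first input, then all of the second). It is not DataStage code; `sequence_funnel` and the sample rows are hypothetical names chosen for illustration.

```python
from itertools import chain


def sequence_funnel(*inputs):
    """Yield every record of the first input, then the second, and so on."""
    return chain(*inputs)


# Hypothetical rows arriving on two input links.
first_link = ["a1", "a2", "a3", "a4", "a5"]
second_link = ["b1", "b2", "b3"]

# A single downstream writer receives the rows already in order.
output = list(sequence_funnel(first_link, second_link))
```

With this arrangement there is only one writer on the file, so no append mode or second Sequential File stage is involved.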

I believe that is what you wanted, correct?
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020