Writing to the same file in the same job
I would like to append lines to the same sequential file from two different stages in the same job. In my current job, the second stage does not write even though I have set its update mode to append. If the first stage writes 5 lines, I would like the second one to append, for example, another 3 lines so the file ends up with 8. I already know of two techniques:
1) Write to two files and create another job to union the files.
2) Run a Unix command as an after-job ExecSH routine.
I was wondering if it is possible to do it in DataStage in one job.
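For what it's worth, technique 2 can be a one-liner. A minimal sketch, assuming the two stages wrote to hypothetical files part1.txt and part2.txt (substitute your real paths):

```shell
# Simulate the two stages' outputs (file names are placeholders):
printf 'r1\nr2\nr3\nr4\nr5\n' > part1.txt   # first stage: 5 lines
printf 'r6\nr7\nr8\n' > part2.txt           # second stage: 3 lines

# After-job ExecSH step: append the second file to the first, then clean up.
cat part2.txt >> part1.txt
rm part2.txt
wc -l part1.txt                             # the file now holds 8 lines
```

This runs safely after the job because both stages have finished and closed their files by the time the after-job routine fires.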
Multiple writers to the same file are not allowed by the OS. You will get misaligned data.
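As an illustration of the "misaligned data" point (a hypothetical shell sketch, not DataStage): two file descriptors opened independently on the same file each track their own offset, so the second writer lands on top of the first writer's bytes.

```shell
# Hypothetical demo: two independent writers on one file.
exec 3> demo.txt          # writer 1 opens the file, offset 0
exec 4> demo.txt          # writer 2 opens the same file, its own offset 0
echo "AAAAAAAAAA" >&3     # writer 1 writes 11 bytes
echo "BB" >&4             # writer 2 writes at ITS offset 0, clobbering writer 1
exec 3>&- 4>&-
cat demo.txt              # the first record is now mangled: "BB" then "AAAAAAA"
```

The file ends up containing neither record intact, which is exactly the misalignment described above.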
If this were a server job and you had two links going in then, theoretically, there would be a round-robin process of sending records, and in both Sequential File stages the option should be append. I guess you can try the same in a PX job by running on a single node.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
DSguru2B wrote: Multiple writes to the same file is not allowed by the OS. You will get misaligned data. If this were a server job and you had two links going in then, theoretically, there would be a round-robin process of sending records, and in both Sequential File stages the option should be append. I guess you can try the same in a PX job by running on a single node.
You've just contradicted yourself. As noted, multiple writers are not supported - period. This is the nature of sequential media - no Server or Parallel job or OS process can break that rule. It's just the way it works. There is no 'round robin' process.
The only way one job could write to a file twice is to ensure the first process completes in its entirety before the second process ever starts. Then the second can append data to the end of the first process's work. Otherwise, write to two files and concatenate post-job.
-craig
"You can never have too many knives" -- Logan Nine Fingers
I know I contradicted myself as the second thought came up.
When you have two links going out to the same file and you have the links ordered, a single row will go through the first link first and then to the second link (round robin). This way each record will be appended to the file one at a time. It will be two separate operations for the OS but, to the naked eye, a single process.
I have not tried it, but theoretically it should work as they are considered two separate processes by the OS.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
DSguru2B wrote: When you have two links going out to the same file and you have the links ordered, a single row will go through the first link first and then to the second link (round robin). This way each record will be appended to the file one at a time. It will be two separate operations for the OS but, to the naked eye, a single process. I have not tried it, but theoretically it should work as they are considered two separate processes by the OS.
No.
-craig
"You can never have too many knives" -- Logan Nine Fingers
crouse wrote: If this were a server job, just use the Link Collector stage to let 1 or more Transformer stages write to the same seq file.
...
But this will not guarantee that the second link will be an append, i.e., the records from the two links will be mixed. I guess the same will occur with the Funnel stage in PX.
Thanks,
Chad
__________________________________________________________________
"There are three kinds of people in this world; Ones who know how to count and the others who don't know how to count !"
ady wrote: Wouldn't a job write to the same sequential file twice if there is a delay between the two write operations? If the file is not written to in the same active stage?
As Craig said, the update/read/write operations on a sequential file are governed by the OS. If you open a sequential file for read/write/update, you cannot reopen it until you have closed it, no matter what application you are using, whether DataStage or a programming language.
Thanks,
Chad
__________________________________________________________________
"There are three kinds of people in this world; Ones who know how to count and the others who don't know how to count !"
ady wrote: Wouldn't a job write to the same sequential file twice if there is a delay between the two write operations? If the file is not written to in the same active stage?
Yes, as I explained earlier, the two writer processes must run in a serial fashion, one after the other. Once the first completes and closes the file, the second can open it, seek to the end, and append.
Multiple readers, single writer.
Your job design is exactly what I meant.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Use a Funnel Stage
Use a Funnel Stage prior to your output file. On Stage Properties select "Funnel Type = Sequence".
Per the help text: Sequence copies all records from the first input data set to the output data set, then all the records from the second input data set, etc.
This would allow you to "drain" the first input, then append the second input.
I believe that is what you wanted, correct?
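In plain file terms, Sequence mode behaves like an ordered concatenation. A minimal sketch outside DataStage, with hypothetical file names standing in for the two input links:

```shell
printf 'r1\nr2\nr3\nr4\nr5\n' > first.txt    # records from the first input link
printf 'r6\nr7\nr8\n' > second.txt           # records from the second input link

# Sequence funnel: drain the whole first input, then the whole second.
cat first.txt second.txt > output.txt
head -n 1 output.txt                         # prints "r1": first input leads
```

The point is the ordering guarantee: unlike a round-robin collector, nothing from the second input appears until the first is fully drained.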