file merge

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

bdixon
Participant
Posts: 35
Joined: Thu Nov 20, 2003 5:45 pm
Location: Australia, Sydney

Post by bdixon »

Hi All,

Does anyone know how to simply merge two files together?
For example, I have two files with exactly the same format and I want to simply append one file to the other.
input file1:
1,test,100.00
input file2:
2,test2,125.00

output file3 (result of the merge):
1,test,100.00
2,test2,125.00

Brad
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Well, there is a Merge stage which 'joins' two sequential file sources, but I don't believe it is available in the version you have.

There is also a Filter option in the Sequential File stage, but again I don't recall if it was available in the 5.x version of the product. It would let you issue an operating system command to send the two files to 'standard out' and thus bring them into the stage as if they were already merged. On a UNIX server, you'd use cat for something like this.
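
As a rough illustration of what "bringing the files in as if they were already merged" means, here is a minimal Python sketch that reads two files as one continuous stream of rows, much as cat would present them on standard out. It is not DataStage code; the function name `rows_from` and the temporary file names are made up for the example.

```python
import os
import tempfile


def rows_from(paths):
    """Yield the lines of each file in turn, as one continuous stream."""
    for path in paths:
        with open(path, newline="") as handle:
            for line in handle:
                # Strip the line terminator, whether \n or \r\n.
                yield line.rstrip("\r\n")


def demo():
    # Hypothetical sample files matching the example in the question.
    workdir = tempfile.mkdtemp()
    file1 = os.path.join(workdir, "file1.txt")
    file2 = os.path.join(workdir, "file2.txt")
    with open(file1, "w") as handle:
        handle.write("1,test,100.00\n")
    with open(file2, "w") as handle:
        handle.write("2,test2,125.00\n")
    return list(rows_from([file1, file2]))


if __name__ == "__main__":
    for row in demo():
        print(row)
```

The consumer never sees a file boundary, so no combined file needs to exist on disk.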

In the worst case, leverage your operating system with a Before Job call to ExecDOS (or is it still ExecSH on a Windows server?) and the copy command to concatenate the two files together. The new resulting file can then be processed by your job.
-craig

"You can never have too many knives" -- Logan Nine Fingers
bdixon
Participant
Posts: 35
Joined: Thu Nov 20, 2003 5:45 pm
Location: Australia, Sydney

Post by bdixon »

I do have a Merge stage available, but I cannot seem to figure out how to use it. Is there any help or documentation available on this?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

I note that you are on version 5.x and am not sure that the Merge stage is available. Note, also, that there is a server Merge stage and a parallel Merge stage, which are completely different from each other.

The server Merge stage has no input links. You provide the pathnames of two source files as stage properties, and how and on what keys you want to merge them. The various classical join types (inner, left outer, etc.) are available. The server Merge stage has one output link, which contains the result.
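
To make the join semantics concrete, here is a small Python sketch of a full outer join on the first column, with the columns of both inputs laid side by side. This is only an illustration of the general idea of a keyed join, not the Merge stage's actual implementation; `outer_join` and `key_index` are names invented for the example.

```python
def outer_join(left_rows, right_rows, key_index=0):
    """Full outer join of two row lists on one key column.

    Rows without a partner are padded with empty fields on the missing side.
    """
    width_left = len(left_rows[0]) if left_rows else 0
    width_right = len(right_rows[0]) if right_rows else 0

    # Index the right-hand rows by key so lookups are cheap.
    right_by_key = {}
    for row in right_rows:
        right_by_key.setdefault(row[key_index], []).append(row)

    joined = []
    matched_keys = set()
    for row in left_rows:
        matches = right_by_key.get(row[key_index])
        if matches:
            matched_keys.add(row[key_index])
            for match in matches:
                joined.append(row + match)
        else:
            joined.append(row + [""] * width_right)

    # Right-hand rows whose key never appeared on the left.
    for row in right_rows:
        if row[key_index] not in matched_keys:
            joined.append([""] * width_left + row)
    return joined


def demo():
    # The two sample rows from the question; their keys (1 and 2) differ.
    file1 = [["1", "test", "100.00"]]
    file2 = [["2", "test2", "125.00"]]
    return [",".join(row) for row in outer_join(file1, file2)]


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Because the keys in the two sample rows do not match, neither row finds a partner, and each comes out padded with three empty fields rather than stacked one under the other.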
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bdixon
Participant
Posts: 35
Joined: Thu Nov 20, 2003 5:45 pm
Location: Australia, Sydney

Post by bdixon »

I have used the Merge stage in DS 5.2, but I cannot get my output to look like this:
1,test,100.00
2,test2,125.00

Instead, it comes out as:
1,test,100.00,,,
,,,2,test2,125.00

Any ideas?
talk2shaanc
Charter Member
Posts: 199
Joined: Tue Jan 18, 2005 2:50 am
Location: India

Post by talk2shaanc »

Well guys, from the example he has given, I think he is looking to append one file below the other, not to join them, which is what the Merge stage does.
For appending, either use a Link Collector or write a routine.

I don't know whether the server version you have includes the Link Collector or not.

seq1 ------>|
            | link collector ------> next stage
seq2 ------>|
Last edited by talk2shaanc on Mon May 30, 2005 7:07 am, edited 1 time in total.
Shantanu Choudhary
dsxdev
Participant
Posts: 92
Joined: Mon Sep 20, 2004 8:37 am

Post by dsxdev »

Hi,
The topic here is appending one file to another, not merging files based on a key.
So the job design would be: read from both files, collect the data, and write it to a single file.
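
The "collect" step can be pictured with a short Python sketch. It assumes a round-robin collection order, taking one row from each input in turn, which is an assumption made for illustration rather than a statement of how any particular stage is configured. The names `collect_round_robin` and `collect_sequential` are invented here.

```python
from itertools import zip_longest

# Sentinel marking "this input has run out of rows".
_MISSING = object()


def collect_round_robin(*inputs):
    """Take one row from each input in turn until all are exhausted."""
    for group in zip_longest(*inputs, fillvalue=_MISSING):
        for row in group:
            if row is not _MISSING:
                yield row


def collect_sequential(*inputs):
    """Emit every row of the first input, then the next, and so on."""
    for rows in inputs:
        yield from rows


if __name__ == "__main__":
    first = ["a1", "a2", "a3"]
    second = ["b1"]
    print(list(collect_round_robin(first, second)))
    print(list(collect_sequential(first, second)))
```

With one row per file the two orders are identical. With more rows, a round-robin collector interleaves the inputs, so it only counts as a true append if row order does not matter downstream.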
Happy DataStaging
Deepak Tyagi
Participant
Posts: 1
Joined: Fri Mar 12, 2004 4:51 am

Post by Deepak Tyagi »

Hi All

I think the example he has given makes it clear that he wants to append one file below the other. According to him, the file format is also the same. In that case you can read from the first file and write into the second file using the "Append to existing file" option available on the General tab.

seqfile1 ----------> seqfile2 (Append to existing file).
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

You can do a pre-job script of 'type file2 >> file1' and use file1 as your source.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The Merge stage was a bad piece of advice on my part if the metadata matches between the two files. But then, if all you are trying to do is concatenate two identically formatted files together, how is that a job for DataStage? Sure, you can write one, but why do it when a simple call to your operating system gets it done for you? Sure, DataStage is a lovely hammer, but that doesn't make every problem a nail that needs to be whacked with it. :wink:

Plus I personally would never advocate concatenating one of the files onto the end of the other if the goal was then to immediately process the combined files. That methodology doesn't allow for any restart or recovery in the event of an error or a need to rerun the job. This is why I specifically mentioned concatenating the two files to a third name and then processing that third file.
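
The restart argument can be demonstrated with a minimal Python sketch that runs each approach twice, as a rerun after a failure would. This is an illustration only, not DataStage code; the helper names and temporary file names are hypothetical.

```python
import os
import shutil
import tempfile


def append_in_place(source, target):
    """Append source onto the end of target, modifying target."""
    with open(source, "rb") as src, open(target, "ab") as dst:
        shutil.copyfileobj(src, dst)


def build_third_file(sources, target):
    """Rebuild target from scratch; the source files are never modified."""
    with open(target, "wb") as dst:
        for source in sources:
            with open(source, "rb") as src:
                shutil.copyfileobj(src, dst)


def count_lines(path):
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def demo():
    workdir = tempfile.mkdtemp()
    file1 = os.path.join(workdir, "file1.txt")
    file2 = os.path.join(workdir, "file2.txt")
    file3 = os.path.join(workdir, "file3.txt")
    with open(file1, "wb") as handle:
        handle.write(b"1,test,100.00\n")
    with open(file2, "wb") as handle:
        handle.write(b"2,test2,125.00\n")

    # Build the third file twice: the result is the same each time.
    build_third_file([file1, file2], file3)
    build_third_file([file1, file2], file3)
    rebuilt = count_lines(file3)

    # Append in place twice: file1 gains file2's row on every run.
    append_in_place(file2, file1)
    append_in_place(file2, file1)
    appended = count_lines(file1)
    return rebuilt, appended


if __name__ == "__main__":
    print(demo())
```

Rebuilding the third file leaves two rows no matter how often it runs, while the in-place append leaves file1 with three rows after the second run, because the original input has been altered.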
-craig
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

bdixon's most recent post still doesn't specify the desired output. However, the data example given earlier suggests that these files are not good candidates for the Merge stage, since there is no apparent key column on which to specify the join.

Can we see the exact output requirement, before leaping in with more suggestions?
bdixon
Participant
Posts: 35
Joined: Thu Nov 20, 2003 5:45 pm
Location: Australia, Sydney

Post by bdixon »

Sorry guys, the topic heading was a bit misleading.
I actually want to append the data of one file to another file, producing a third file.
For example:
input file1:
1,test,100.00
input file2:
2,test2,125.00
output file3 (result of the append):
1,test,100.00
2,test2,125.00

What is the best way to do this? I am guessing it would be a subroutine?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Depends, somewhat, on what happens next with the new file. Going back to one of the original suggestions - use the copy command:

copy file1+file2 file3
The file noted as 'file3' becomes the product of the first two files concatenated together. Do this 'before job' to a fixed filename for processing in the job if that is what you had in mind. Or in a routine using DSExecute if you prefer, for use in a Sequencer or the like.
-craig