merge data from two files

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
ds_user78
Participant
Posts: 23
Joined: Thu Nov 11, 2004 5:39 pm

merge data from two files

Post by ds_user78 »

hi, i have two files f1 and f2 which looks like this.

f1
c1,c2, ..
100,1,..
100,2,..
100,3,..
200,1,..
200,2,..
300,1,..

and f2
c1,c3,..
100,1000,..
100,2000,..
300,1000,..
300,2000,..

the output file should be
c1,c2,c3,
100,1,1000,..
100,2,2000,..
100,3,NULL,..
200,1,NULL,..
200,2,NULL,..
300,1,1000,...
300,NULL,2000,..

As you can see the data from the two files have to be 'merged' together. How ever i try with hash files i end up having something like a cartesian product. Would it be possible to do it DS. The c3 column in f2 file will have only 2 values 1000,2000 if that will help.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

You can include an imaginative derived column which can be populated by a running sequence number for that group and generated using a stage variable with the check of (if linkKey <> prevKey Then 1 Else DerSeq + 1)

By this way, your second file will become something like
and f2
c1,derived_seq, c3,..
100,1,1000,..
100,2,2000,..
300,1,1000,..
300,2,2000,..

and as your first file already has some sequence number in form of c2 (otherwise you can create a derived seq for that too), you can use the original key and derived seq number combination to be the new key in reference them.

If you need more information, just shout.
Post Reply