SortMerge Collector

bvishwanathr
Participant
Posts: 11
Joined: Wed Dec 20, 2006 8:20 am
Location: Hyderabad

Post by bvishwanathr »

Hi,

I would like some clarification on the SortMerge collector method. Here is what I intend to do in my job: combine/merge four sorted input files that share the same column definitions into one output file that is fully sorted. I am sure there are multiple ways to achieve this, but here is what I am doing:

PX job, using a multi-node configuration file

Code:

SeqFl Stage1 ---> SeqFl Stage2

Properties (Tab) of 'SeqFl Stage1':
Source:
 File=inputfile1
 File=inputfile2
 File=inputfile3
 File=inputfile4
 ReadMethod=Specific File(s)

Properties (Tab) of 'SeqFl Stage2':
Target:
 File=sortedoutputfile
 File Update Mode=Overwrite

Partitioning (Tab) of 'SeqFl Stage2':
Collector type: SortMerge

Keys: The same keys on which the input files are sorted.
Conceptually, I feel this should work, given that the multiple input files named in the input Sequential File stage are read one after the other. I have tested the job with a small number of records and it works.

Can someone please comment on or clarify this design?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

That is precisely what SortMerge collector does.
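
For anyone who prefers to see the idea in code, a sort-merge collect can be thought of as a k-way merge of inputs that are each already sorted on the key. The following is only a conceptual sketch in Python, not DataStage code and not how the engine is implemented. The function name sort_merge_collect and the sample rows are made up for illustration, and it assumes every input really is sorted on the collecting key; if one is not, the output will not be fully sorted.

```python
import heapq
from operator import itemgetter


def sort_merge_collect(partitions, key):
    """Lazily merge already-sorted inputs into one sorted stream.

    Hypothetical helper: each element of `partitions` stands in for one
    sorted input file (or partition), and `key` extracts the sort key.
    """
    # heapq.merge repeatedly yields the smallest head row among all inputs,
    # so it never needs to re-sort or hold the full data in memory.
    return heapq.merge(*partitions, key=key)


# Four made-up inputs, each already sorted on the first column.
partitions = [
    [(1, "a"), (5, "e")],
    [(2, "b"), (6, "f")],
    [(3, "c"), (7, "g")],
    [(4, "d"), (8, "h")],
]

merged = list(sort_merge_collect(partitions, key=itemgetter(0)))
for row in merged:
    print(row)
```

The point of the sketch is that the merge only compares the current head row of each input, which is why the inputs must be pre-sorted on the same keys given to the collector.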
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.