Stage Memory usage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

shrey3a
Premium Member
Posts: 234
Joined: Sun Nov 21, 2004 10:41 pm

Stage Memory usage

Post by shrey3a »

Hi Gurus,

I'm in the middle of designing a process and am unsure which stage to use:
Join or Merge. I can achieve the result with either stage, but we will be processing large volumes of data and joining/merging 7-8 links on the same key.

I would like to know which of the two stages uses less memory and runs faster. Should I use the Join stage or the Merge stage?

Regards,
wesd
Participant
Posts: 22
Joined: Mon Aug 16, 2004 8:56 pm

Re: Stage Memory usage

Post by wesd »

shrey3a wrote:Hi Gurus,

I'm in the middle of designing a process and am unsure which stage to use:
Join or Merge. I can achieve the result with either stage, but we will be processing large volumes of data and joining/merging 7-8 links on the same key.

I would like to know which of the two stages uses less memory and runs faster. Should I use the Join stage or the Merge stage?

Regards,
It depends on whether you want reject links. Join performs very well, but Merge lets you attach reject links to capture unmatched rows for error tracking.
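To make the difference concrete, here is a minimal conceptual sketch in plain Python. It is not DataStage code: the function names, the `key` field and the sample rows are all made up for illustration, and it assumes the merge keeps every master row. It only shows that an inner join silently drops unmatched rows, while a merge-style operation can hand unmatched update rows to a reject output.

```python
def inner_join(left, right):
    """Inner join on 'key': rows without a partner on the other side are dropped."""
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row["key"], []).append(row)
    return [{**l, **r} for l in left for r in right_by_key.get(l["key"], [])]


def merge_with_rejects(master, update):
    """Merge-like model (assumption): keep every master row, add update columns
    where the key matches, and collect unmatched update rows as rejects."""
    master_keys = {row["key"] for row in master}
    update_by_key = {row["key"]: row for row in update if row["key"] in master_keys}
    rejects = [row for row in update if row["key"] not in master_keys]
    merged = [{**m, **update_by_key.get(m["key"], {})} for m in master]
    return merged, rejects


# Hypothetical sample data: key 2 has no update row, key 3 has no master row.
master = [{"key": 1, "name": "a"}, {"key": 2, "name": "b"}]
update = [{"key": 1, "qty": 10}, {"key": 3, "qty": 30}]

joined = inner_join(master, update)
merged, rejects = merge_with_rejects(master, update)
```

In this model the join returns only the key 1 row, whereas the merge returns both master rows and puts the key 3 update row in `rejects`, which is what makes error tracking possible.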
Wes Dumey
Senior Consultant
Data Warehouse Projects
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Behaviour is different if there are duplicates on the inputs. In a Merge stage rows are consumed from the Update inputs.
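A rough way to picture this, as a plain Python model rather than DataStage code: the sketch below assumes that each update row can be used by only one master row (it is "consumed"), whereas a join pairs every matching row with every other. The function names and sample rows are hypothetical, and what happens to leftover update rows is deliberately not modelled.

```python
from collections import defaultdict, deque


def join_rows(left, right):
    """Inner join on 'key': every left row pairs with every matching right row."""
    right_by_key = defaultdict(list)
    for r in right:
        right_by_key[r["key"]].append(r)
    return [{**l, **r} for l in left for r in right_by_key.get(l["key"], [])]


def merge_consuming(master, update):
    """Merge-like model (assumption): an update row is consumed by the first
    master row with the same key and is not available to later duplicates."""
    pending = defaultdict(deque)
    for u in update:
        pending[u["key"]].append(u)
    out = []
    for m in master:
        queue = pending.get(m["key"])
        u = queue.popleft() if queue else {}  # consume at most one update row
        out.append({**m, **u})
    return out


# Hypothetical data: the master input has a duplicate key.
master = [{"key": 1, "name": "a"}, {"key": 1, "name": "b"}]
update = [{"key": 1, "qty": 10}]

joined = join_rows(master, update)
merged = merge_consuming(master, update)
```

Here the join attaches `qty` to both master rows, while the merge model attaches it only to the first one, so the two designs stop being interchangeable as soon as keys repeat.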
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

Why not performance-test both designs? If you have the core of the job built, it should be easy to produce a version with each stage in it. So much depends on your scratch space, I/O, RAM utilisation across the rest of the job, RAM-to-CPU ratio and so on. The only firm answer I can give you is to test both and find out.