different outputs from difference stage

emma
Premium Member
Posts: 95
Joined: Fri Aug 08, 2003 10:30 am
Location: Montreal

Post by emma »

I have two Data Set inputs feeding a Difference stage.
Every time I run the job, it produces a different number of output rows.

The Difference stage input is hash-partitioned and sorted on the keys.

What am I doing wrong?
Thanks,
Emma
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are the data sorted?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mgendy
Premium Member
Posts: 44
Joined: Thu Sep 10, 2009 5:30 am

Post by mgendy »

Check that the data is sorted and partitioned on all of the difference keys, and use the proper partitioning method. Hash partitioning is recommended if you have multiple difference keys.
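
To illustrate why both inputs need to be partitioned on the same keys, here is a minimal Python sketch. It is a simplified model, not DataStage's actual implementation: the function names, the `id` key column, and the sample rows are all hypothetical, and "difference" is reduced to counting before-rows that have no matching key in the same partition of the after-input. It does not reproduce run-to-run variation by itself; it only shows that the count depends on whether matching keys land in the same partition.

```python
from collections import defaultdict


def hash_partition(rows, key, n):
    """Send each row to partition (key value mod n); equal keys always co-locate."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key] % n].append(row)
    return parts


def round_robin_partition(rows, n):
    """Deal rows out in arrival order, ignoring key values entirely."""
    parts = defaultdict(list)
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def unmatched_per_partition(before_parts, after_parts, key, n):
    """Count before-rows with no matching key in the SAME partition of after."""
    total = 0
    for p in range(n):
        after_keys = {row[key] for row in after_parts.get(p, [])}
        total += sum(
            1 for row in before_parts.get(p, []) if row[key] not in after_keys
        )
    return total


# Hypothetical sample data: keys 1..10 before, keys 3..12 after.
before = [{"id": i} for i in range(1, 11)]
after = [{"id": i} for i in range(3, 13)]
N = 4  # assumed number of partitions (nodes)

# Ground truth computed without any partitioning: keys 1 and 2 are unmatched.
expected = sum(1 for r in before if r["id"] not in {a["id"] for a in after})

hashed = unmatched_per_partition(
    hash_partition(before, "id", N), hash_partition(after, "id", N), "id", N
)
round_robin = unmatched_per_partition(
    round_robin_partition(before, N), round_robin_partition(after, N), "id", N
)

print("expected:", expected)
print("hash-partitioned on key:", hashed)
print("round-robin partitioned:", round_robin)
```

In this model, hash partitioning both inputs on the key gives the same count as the unpartitioned comparison, while round-robin overcounts because matching keys end up in different partitions and are never compared.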
Last edited by mgendy on Tue Jan 12, 2010 2:24 am, edited 1 time in total.
Mohmmed Elgendy
Senior System Analyst
Data Integration Team
Etisalat Egypt
+20 1118511161
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Presumably the input Data Sets are not changing between runs?

Can you give a couple of examples of input and output row counts? For example:

Run   Before   After   Output
 1      2342    2344      412
 2      2342    2344      414