Change Capture with 1.5 million rows

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
vercama
Participant
Posts: 11
Joined: Thu Aug 07, 2008 3:21 am

Change Capture with 1.5 million rows

Post by vercama »

Hi all,
I've never used the Change Capture stage in a PX, but I know that it can be used to extract the differences between two tables, in particular the T1-T2 (probably it would need to run twice, T1-T2 and T2-T1, to get all the differences). What about performances? I had to check whether or not I can use this stage for two compatible tables with even 1.5 million rows, with differences may be of 150 rows only. Do you think it's reasonable to use this stage in this case?

Thanks,
Marco
kris007
Charter Member
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Post by kris007 »

You can use Change capture stage without any problem as long as you partition and sort the data. You could also use Merge Stage and collect the rejects in a different link.
Kris

Where's the "Any" key?-Homer Simpson
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

It's reasonable (to use Change Capture stage), and you only need one pass.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vercama
Participant
Posts: 11
Joined: Thu Aug 07, 2008 3:21 am

Performances?

Post by vercama »

But what about performances with so many rows? The current solution is that if the count() in the two tables are different of at least one row, then target table is truncated and then everything is loaded from scratch.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

1.5 million is not a lot of rows for a parallel job. Go for it.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply