DSXchange

Posted: **Thu Feb 12, 2009 10:10 am**

Hi all,
I've never used the Change Capture stage in a PX job, but I know it can be used to extract the differences between two tables, in particular T1-T2 (it would probably need to run twice, T1-T2 and T2-T1, to get all the differences). What about performance? I need to check whether I can use this stage on two compatible tables with as many as 1.5 million rows, where perhaps only 150 rows differ. Do you think it's reasonable to use this stage in this case?

Thanks,
Marco

Posted: **Thu Feb 12, 2009 12:12 pm**

You can use the Change Capture stage without any problem, as long as you partition and sort the data. You could also use the Merge stage and collect the rejects on a separate link.

Posted: **Thu Feb 12, 2009 2:17 pm**

It's reasonable (to use Change Capture stage), and you only need one pass.
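
To illustrate why one pass is enough, here is a minimal Python sketch of the idea (not DataStage code): with both inputs sorted on the key, a single merge-style walk finds inserts, deletes and edits together, so there is no need for a separate T1-T2 and T2-T1 run. The function name, the `(key, value)` row layout and the `keep_copies` flag are made up for the example, and the numeric codes follow what I understand to be the stage's default change codes (0 copy, 1 insert, 2 delete, 3 edit), so treat those as an assumption.

```python
# Assumed change codes (believed to match the stage defaults; not verified here).
COPY, INSERT, DELETE, EDIT = 0, 1, 2, 3


def change_capture(before, after, keep_copies=False):
    """Compare two lists of (key, value) rows, each sorted by unique key.

    Returns a list of (change_code, key, value) tuples from one pass
    over both inputs.
    """
    changes = []
    i = j = 0
    while i < len(before) and j < len(after):
        b_key, b_val = before[i]
        a_key, a_val = after[j]
        if b_key == a_key:
            if b_val != a_val:
                changes.append((EDIT, a_key, a_val))
            elif keep_copies:
                changes.append((COPY, a_key, a_val))
            i += 1
            j += 1
        elif b_key < a_key:
            # Key exists only in the "before" input.
            changes.append((DELETE, b_key, b_val))
            i += 1
        else:
            # Key exists only in the "after" input.
            changes.append((INSERT, a_key, a_val))
            j += 1
    # Whatever is left on either side has no partner on the other.
    changes.extend((DELETE, k, v) for k, v in before[i:])
    changes.extend((INSERT, k, v) for k, v in after[j:])
    return changes


before = [(1, "a"), (2, "b"), (3, "c")]
after = [(1, "a"), (2, "B"), (4, "d")]
print(change_capture(before, after))
# [(3, 2, 'B'), (2, 3, 'c'), (1, 4, 'd')]
```

The work is linear in the total number of rows once the data is sorted, which is why the sort (and partitioning on the same key) matters more than the number of differences.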

Posted: **Fri Feb 13, 2009 2:25 am**

But what about performance with so many rows? The current solution is that if the count() of the two tables differs by at least one row, the target table is truncated and everything is reloaded from scratch.

Posted: **Fri Feb 13, 2009 4:23 am**

1.5 million is not a lot of rows for a parallel job. Go for it.


Change Capture with 1.5 million rows


Performances?