Page 1 of 1

Can I use Change capture for this ?

Posted: Mon Feb 12, 2007 9:43 am
by ady
Hi,

I have a question about how change capture works. Does change capture compare rows in the order they are located on the file, or does it look if that row exists anywhere in the file ?

let me explain ...


I have 2 versions of the same job ..... one is a server job and one is a parallel job ( they perform the same joins ). But I get about 2000 rows more than the server job in the parallel. when i do a sort on the data and perform change capture, it shows all rows as edited.

Is this b'coz the rows are sorted in a different order on both the input data because of the excess rows ? Is there a way I can use change capture here ? ..... or can I use a Lookup?

plz help me out with this ..

Posted: Mon Feb 12, 2007 3:00 pm
by ray.wurlod
In the parallel job the inputs to the Change Capture stage need not only to be sorted identically on the comparison keys but also hash partitioned identically on the comparison keys. Have you ensured that this is the case?

Change Capture stage will report whether the second input row is an edit of the first, the same as the first, does not exist in the first (an insert) or does not exist in the second (a delete).

Posted: Mon Feb 12, 2007 6:43 pm
by vijayrc
ray.wurlod wrote:In the parallel job the inputs to the Change Capture stage need not only to be sorted identically on the comparison keys but also hash partitioned identically on the comparison keys. Have you ensured ...
Ray,
Thanks for the info. Happened to try out Change Capture for the first time[after trying out Difference Stage, and also Compare Stage for one of our Compare process].
If there's a Field value difference between 'before' and 'after' dataset, would Change Capture show both the 'before' and 'after' records along with their values ?? I'm having hard-time getting this to work.
Thanks
Vijay

Posted: Mon Feb 12, 2007 7:03 pm
by DSguru2B
Change Capture Stage will capture changes as codes in an additional column. This extra column will have codes. Read about it in DS help.

Posted: Mon Feb 12, 2007 8:43 pm
by ray.wurlod
Change Capture will not yield old and new. Its output is a "change" Data Set that can be reapplied to the "before" Data Set to reproduce the "after" Data Set using a Change Apply stage.

Posted: Tue Feb 13, 2007 9:00 am
by ady
ray.wurlod wrote:In the parallel job the inputs to the Change Capture stage need not only to be sorted identically on the comparison keys but also hash partitioned identically on the comparison keys. Have you ensured that this is the case?

Change Capture stage will report whether the second input row is an edit of the first, the same as the first, does not exist in the first (an insert) or does not exist in the second (a delete).

Ray,

I did sort the data identically on 1 key and hash partitioned it. It dosent seem to be working for me.

I have a question. I have 2 data sets with 1 column and data like:

1
2
3
5
6
7
8


and


1
2
3
4
5
6
7
8


when I do a change capture on these 2 data sets it should give out the value "4" as the only changed (insert) row right ?

or will it compare the datasets as 3 to 4, 4 to 5, 5 to 6,7 to 8 and finally 8 in the second dataset as an insert?. Theoritically it should give out only 4 as the change ..... am i right ?

HI

Posted: Tue Feb 13, 2007 10:35 am
by umamahes
If you set Drop out for edit = false
and Drop out for insert=false

It will give only the changed and new record

Thanks
Uma

Posted: Tue Feb 13, 2007 2:41 pm
by ray.wurlod
While being aware of the Drop... properties that Uma suggests, which all default to False, why don't you experiment?

There is a match for Key column where value is one of 5,6,7,8. The Compare stage is not positional - it requires the Key column to match and then performs a comparison on all change columns. In your case, because there is a match on Key and no change columns, the Change Capture stage does not include these rows in a "change" Data Set on its output.

Posted: Sat Feb 24, 2007 8:41 am
by HSBCdev
What do you have 'Change Mode' set to? Have you explicitly defined your keys and your values?