Can I use Change capture for this ?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
ady
Premium Member
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Can I use Change capture for this ?

Post by ady »

Hi,

I have a question about how change capture works. Does change capture compare rows in the order they are located on the file, or does it look if that row exists anywhere in the file ?

let me explain ...


I have 2 versions of the same job ..... one is a server job and one is a parallel job ( they perform the same joins ). But I get about 2000 rows more than the server job in the parallel. when i do a sort on the data and perform change capture, it shows all rows as edited.

Is this b'coz the rows are sorted in a different order on both the input data because of the excess rows ? Is there a way I can use change capture here ? ..... or can I use a Lookup?

plz help me out with this ..
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

In the parallel job the inputs to the Change Capture stage need not only to be sorted identically on the comparison keys but also hash partitioned identically on the comparison keys. Have you ensured that this is the case?

Change Capture stage will report whether the second input row is an edit of the first, the same as the first, does not exist in the first (an insert) or does not exist in the second (a delete).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vijayrc
Participant
Posts: 197
Joined: Sun Apr 02, 2006 10:31 am
Location: NJ

Post by vijayrc »

ray.wurlod wrote:In the parallel job the inputs to the Change Capture stage need not only to be sorted identically on the comparison keys but also hash partitioned identically on the comparison keys. Have you ensured ...
Ray,
Thanks for the info. Happened to try out Change Capture for the first time[after trying out Difference Stage, and also Compare Stage for one of our Compare process].
If there's a Field value difference between 'before' and 'after' dataset, would Change Capture show both the 'before' and 'after' records along with their values ?? I'm having hard-time getting this to work.
Thanks
Vijay
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Change Capture Stage will capture changes as codes in an additional column. This extra column will have codes. Read about it in DS help.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Change Capture will not yield old and new. Its output is a "change" Data Set that can be reapplied to the "before" Data Set to reproduce the "after" Data Set using a Change Apply stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ady
Premium Member
Premium Member
Posts: 189
Joined: Thu Oct 12, 2006 12:08 am

Post by ady »

ray.wurlod wrote:In the parallel job the inputs to the Change Capture stage need not only to be sorted identically on the comparison keys but also hash partitioned identically on the comparison keys. Have you ensured that this is the case?

Change Capture stage will report whether the second input row is an edit of the first, the same as the first, does not exist in the first (an insert) or does not exist in the second (a delete).

Ray,

I did sort the data identically on 1 key and hash partitioned it. It dosent seem to be working for me.

I have a question. I have 2 data sets with 1 column and data like:

1
2
3
5
6
7
8


and


1
2
3
4
5
6
7
8


when I do a change capture on these 2 data sets it should give out the value "4" as the only changed (insert) row right ?

or will it compare the datasets as 3 to 4, 4 to 5, 5 to 6,7 to 8 and finally 8 in the second dataset as an insert?. Theoritically it should give out only 4 as the change ..... am i right ?
umamahes
Premium Member
Premium Member
Posts: 110
Joined: Tue Jul 04, 2006 9:08 pm

HI

Post by umamahes »

If you set Drop out for edit = false
and Drop out for insert=false

It will give only the changed and new record

Thanks
Uma
HI
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

While being aware of the Drop... properties that Uma suggests, which all default to False, why don't you experiment?

There is a match for Key column where value is one of 5,6,7,8. The Compare stage is not positional - it requires the Key column to match and then performs a comparison on all change columns. In your case, because there is a match on Key and no change columns, the Change Capture stage does not include these rows in a "change" Data Set on its output.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
HSBCdev
Premium Member
Premium Member
Posts: 141
Joined: Tue Mar 16, 2004 8:22 am
Location: HSBC - UK and India
Contact:

Post by HSBCdev »

What do you have 'Change Mode' set to? Have you explicitly defined your keys and your values?
Post Reply