Can I use Change capture for this ?
Moderators: chulett, rschirm, roy
Can I use Change capture for this ?
Hi,
I have a question about how change capture works. Does change capture compare rows in the order they are located on the file, or does it look if that row exists anywhere in the file ?
let me explain ...
I have 2 versions of the same job ..... one is a server job and one is a parallel job ( they perform the same joins ). But I get about 2000 rows more than the server job in the parallel. when i do a sort on the data and perform change capture, it shows all rows as edited.
Is this b'coz the rows are sorted in a different order on both the input data because of the excess rows ? Is there a way I can use change capture here ? ..... or can I use a Lookup?
plz help me out with this ..
I have a question about how change capture works. Does change capture compare rows in the order they are located on the file, or does it look if that row exists anywhere in the file ?
let me explain ...
I have 2 versions of the same job ..... one is a server job and one is a parallel job ( they perform the same joins ). But I get about 2000 rows more than the server job in the parallel. when i do a sort on the data and perform change capture, it shows all rows as edited.
Is this b'coz the rows are sorted in a different order on both the input data because of the excess rows ? Is there a way I can use change capture here ? ..... or can I use a Lookup?
plz help me out with this ..
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
In the parallel job the inputs to the Change Capture stage need not only to be sorted identically on the comparison keys but also hash partitioned identically on the comparison keys. Have you ensured that this is the case?
Change Capture stage will report whether the second input row is an edit of the first, the same as the first, does not exist in the first (an insert) or does not exist in the second (a delete).
Change Capture stage will report whether the second input row is an edit of the first, the same as the first, does not exist in the first (an insert) or does not exist in the second (a delete).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Ray,ray.wurlod wrote:In the parallel job the inputs to the Change Capture stage need not only to be sorted identically on the comparison keys but also hash partitioned identically on the comparison keys. Have you ensured ...
Thanks for the info. Happened to try out Change Capture for the first time[after trying out Difference Stage, and also Compare Stage for one of our Compare process].
If there's a Field value difference between 'before' and 'after' dataset, would Change Capture show both the 'before' and 'after' records along with their values ?? I'm having hard-time getting this to work.
Thanks
Vijay
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Change Capture will not yield old and new. Its output is a "change" Data Set that can be reapplied to the "before" Data Set to reproduce the "after" Data Set using a Change Apply stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ray.wurlod wrote:In the parallel job the inputs to the Change Capture stage need not only to be sorted identically on the comparison keys but also hash partitioned identically on the comparison keys. Have you ensured that this is the case?
Change Capture stage will report whether the second input row is an edit of the first, the same as the first, does not exist in the first (an insert) or does not exist in the second (a delete).
Ray,
I did sort the data identically on 1 key and hash partitioned it. It dosent seem to be working for me.
I have a question. I have 2 data sets with 1 column and data like:
1
2
3
5
6
7
8
and
1
2
3
4
5
6
7
8
when I do a change capture on these 2 data sets it should give out the value "4" as the only changed (insert) row right ?
or will it compare the datasets as 3 to 4, 4 to 5, 5 to 6,7 to 8 and finally 8 in the second dataset as an insert?. Theoritically it should give out only 4 as the change ..... am i right ?
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
While being aware of the Drop... properties that Uma suggests, which all default to False, why don't you experiment?
There is a match for Key column where value is one of 5,6,7,8. The Compare stage is not positional - it requires the Key column to match and then performs a comparison on all change columns. In your case, because there is a match on Key and no change columns, the Change Capture stage does not include these rows in a "change" Data Set on its output.
There is a match for Key column where value is one of 5,6,7,8. The Compare stage is not positional - it requires the Key column to match and then performs a comparison on all change columns. In your case, because there is a match on Key and no change columns, the Change Capture stage does not include these rows in a "change" Data Set on its output.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.