how and which stages are used to main version in SCD 2
Moderators: chulett, rschirm, roy
how and which stages are used to main version in SCD 2
Hi all,
i want to implement SCD 2 in parallel job. i have tried using
change capture and change apply stage, but using these stage i am not able to maintain version control.
my requirement is that if
1) new record in the source -> then insert it into target with new row id
2) changed record in the source -> then insert this record into the target as well as update the exisiting record with new time(end time) to main history.
Any help would be greatly appreciated. Thanks!
rafiq
i want to implement SCD 2 in parallel job. i have tried using
change capture and change apply stage, but using these stage i am not able to maintain version control.
my requirement is that if
1) new record in the source -> then insert it into target with new row id
2) changed record in the source -> then insert this record into the target as well as update the exisiting record with new time(end time) to main history.
Any help would be greatly appreciated. Thanks!
rafiq
Hi Rafiq,
The inputs to the CDC should be your source and target. The output of the CDC should lead to a transformer. Have 2 constraints in the transformer
1. change_code = 1 or change_code= 3
2. change_code = 3
From the transformer use the 1st constraint and lead the records to a funnel. From the transformer use the 2nd constraint and lead the records to a join stage and join with the target records. The output of the join stage should go to the same funnel. Now the output of the funnel will have both inserts and updates.
HTH
--Rich
The inputs to the CDC should be your source and target. The output of the CDC should lead to a transformer. Have 2 constraints in the transformer
1. change_code = 1 or change_code= 3
2. change_code = 3
From the transformer use the 1st constraint and lead the records to a funnel. From the transformer use the 2nd constraint and lead the records to a join stage and join with the target records. The output of the join stage should go to the same funnel. Now the output of the funnel will have both inserts and updates.
HTH
--Rich
Hi Rich,
thanks for ur quick response
As per your solution i went thrugh it. But whenever there is any insertion or updation in the source file(text file), the conditions in the transformer become false. when i give condition like change_code=2 then it become false?
which is i suppose for deletetion. so what may be the reason?
my requirement is as
suppose i have source data like
NO NAME SAL
10 ABC 100
20 XYZ 200
30 CBA 300
and in the target i should have for the first time populate like this
SURROGATE_KEY NO NAME SAL VERSION
1 10 ABC 100 1000
2 20 XYZ 200 2000
3 30 CBA 300 3000
now for the next load, in the souce file gets changed like
NO NAME SAL
10 ABC 100
20 XYZ 200
30 XXX 300(changed in name)
40 SSS 400(new row)
for the next load, the target should have records like
SURROGATE_KEY NO NAME SAL VERSION
1 10 ABC 100 1000
2 20 XYZ 200 2000
3 30 CBA 300 3000
4 30 XXX 300 3001(increment corr version by 1)
5 40 SSS 400 4001
any solution or help will be greatly appreciated
Thanks in advance
Rafiq
thanks for ur quick response
As per your solution i went thrugh it. But whenever there is any insertion or updation in the source file(text file), the conditions in the transformer become false. when i give condition like change_code=2 then it become false?
which is i suppose for deletetion. so what may be the reason?
my requirement is as
suppose i have source data like
NO NAME SAL
10 ABC 100
20 XYZ 200
30 CBA 300
and in the target i should have for the first time populate like this
SURROGATE_KEY NO NAME SAL VERSION
1 10 ABC 100 1000
2 20 XYZ 200 2000
3 30 CBA 300 3000
now for the next load, in the souce file gets changed like
NO NAME SAL
10 ABC 100
20 XYZ 200
30 XXX 300(changed in name)
40 SSS 400(new row)
for the next load, the target should have records like
SURROGATE_KEY NO NAME SAL VERSION
1 10 ABC 100 1000
2 20 XYZ 200 2000
3 30 CBA 300 3000
4 30 XXX 300 3001(increment corr version by 1)
5 40 SSS 400 4001
any solution or help will be greatly appreciated
Thanks in advance
Rafiq
Rafik,
the basic approach is correct, but something seems to not be working correctly in your case. Could you also output the change_code value and see what it is for each row?
the basic approach is correct, but something seems to not be working correctly in your case. Could you also output the change_code value and see what it is for each row?
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>
how to maintain version number in SCD 2
Thanks Arnd and Rich,
The problem has been resolved after changing the link order.
Now my problem is how to maintain the version number of each record in the target as i mentioned in the last message.
source as well as target are text files
any solution?
Thanks in advance
Rafiq
The problem has been resolved after changing the link order.
Now my problem is how to maintain the version number of each record in the target as i mentioned in the last message.
source as well as target are text files
any solution?
Thanks in advance
Rafiq
-
- Charter Member
- Posts: 166
- Joined: Wed Mar 16, 2005 6:52 am
- Location: Mumbai, India
AFAIK the only 2 operations supported by sequential files are append and overwrite.
The sequential file stage has no mechanism for updating previously written rows. if it is imperative that output be a text file, you will need to design logic that re-creates the entire file.
One thing that comes to mind is separate the updates and deletes from the inserts and copies and then lookup the original file(Stream) against the updates(Reference) (reversed this intentionally), update the end date and recombine these records with the inserts and copies and build a new file with all the records.
IHTH
The sequential file stage has no mechanism for updating previously written rows. if it is imperative that output be a text file, you will need to design logic that re-creates the entire file.
One thing that comes to mind is separate the updates and deletes from the inserts and copies and then lookup the original file(Stream) against the updates(Reference) (reversed this intentionally), update the end date and recombine these records with the inserts and copies and build a new file with all the records.
IHTH
Amey Vaidya<i>
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.</i>
<i>- Douglas Adams</i>
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.</i>
<i>- Douglas Adams</i>
Generating Surrogate Key as parameter
Hi,
can any one suggest me the Design of SCD2.
how can i generate Surr Key for Next load.
can any one suggest me the Design of SCD2.
how can i generate Surr Key for Next load.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
There are at least five different ways for generating surrogate keys which you would have discovered had you bothered to Search rather than hijacking a thread that is more than two years old.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Code: Select all
1.Just perform a join on the natural key columns and seperate the new records from the old ones
2.In the next step assign the surrogate keys to the new records.
3.Third would be the update.
P.S Are my DWH concepts outdated or your talking about type 3 and not 2 here because i thought in SCD type 2 a new record is added to the table to represent the new information. Therefore, both the original and the new record will be present. The new record gets its own primary key.
Sorry to bring up the old post. But this is the closest that I found from my search.
I am planning to implement SCD and obtaining three files Insert, UpdateType2, UpdateType1. Using which I update the Target Table. However I have versioning in Target Table (in addition to it, I want to update based on surrogate keys due to the huge number of natural keys, which eventually would be the where clause while updating instead of surrogate key).
Is the below approach approach a good solution or do we implement SCD type 2 for versioning similar to that of Server Edition.
As with the below solution I don't see that the CDC stage is being used efficiently.
I am planning to implement SCD and obtaining three files Insert, UpdateType2, UpdateType1. Using which I update the Target Table. However I have versioning in Target Table (in addition to it, I want to update based on surrogate keys due to the huge number of natural keys, which eventually would be the where clause while updating instead of surrogate key).
Is the below approach approach a good solution or do we implement SCD type 2 for versioning similar to that of Server Edition.
As with the below solution I don't see that the CDC stage is being used efficiently.
AFAIK the only 2 operations supported by sequential files are append and overwrite.
The sequential file stage has no mechanism for updating previously written rows. if it is imperative that output be a text file, you will need to design logic that re-creates the entire file.
One thing that comes to mind is separate the updates and deletes from the inserts and copies and then lookup the original file(Stream) against the updates(Reference) (reversed this intentionally), update the end date and recombine these records with the inserts and copies and build a new file with all the records.
IHTH
-
- Participant
- Posts: 3
- Joined: Tue May 05, 2009 11:57 pm