SCD In Parallel Jobs
Moderators: chulett, rschirm, roy
SCD In Parallel Jobs
Hi,
I am getting records from the source table and, after doing a lookup against the target, I have to insert both the new records and the updated records into the target table.
If I use the Change Data Capture stage or the Difference stage, it gives only the updated records in the output, but I have to maintain history in the target table, i.e. I want both the existing record and the updated record in the target table. I don't have flag columns in the target tables.
Is this possible in parallel jobs without using Transformer logic?
Regards
Siva
Last edited by shivakumar on Tue Aug 01, 2006 1:29 am, edited 1 time in total.
This topic was already discussed earlier.
Refer to this topic:
viewtopic.php?t=102403&highlight=scd
Regards,
iamnagus.
Re: SCD In Parallel Jobs
[quote="shivakumar"]Hi,
I am getting records from the source table and, after doing a lookup against the target, I have to insert both the new records and the updated records into the target table.
If I use the Change Data Capture stage or the Difference stage, it gives only the updated records in the output, but I have to maintain history in the target table, i.e. I want both the existing record and the updated record in the target table. I don't have flag columns in the target tables.
Is this possible in parallel jobs without using Transformer logic?
Regards
Siva[/quote]
I don't have date fields in the target table. What I am asking is how to get the updated and the existing records without Transformer logic. Is there any stage available that gives the updated records and also the previous records in the output?
There are two ways to get and apply your change capture. You start with a Before set of data and an After set of data. If you use the Change Capture stage alone, it gives you new/changed/deleted/unchanged records. There is a flag to turn each of these on or off, and an extra field is written out that indicates what type of change it is.
You can now either apply this to a target database table using insert/update/delete/load stages with transformers and filters, OR you can merge it with your Before set of data using the Change Apply stage.
The Change Apply stage will give you the new changes plus the old history.
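The Change Capture / Change Apply pairing described above can be sketched outside of DataStage. This is a minimal Python sketch, not the stage itself: the change codes mirror the PX stage's documented defaults (0 = copy, 1 = insert, 2 = delete, 3 = edit), but treat those values and the single-column key as assumptions and check your own stage properties.

```python
def change_capture(before, after, key):
    """Compare two dicts of rows (keyed on the business key) and emit
    (change_code, row) pairs, like the Change Capture stage's extra column."""
    changes = []
    for k, row in after.items():
        if k not in before:
            changes.append((1, row))        # insert: key is new
        elif row != before[k]:
            changes.append((3, row))        # edit: key exists, values differ
    for k, row in before.items():
        if k not in after:
            changes.append((2, row))        # delete: key vanished from After
    return changes

def change_apply(before, changes, key):
    """Merge the change set back onto the Before image, like Change Apply."""
    result = dict(before)
    for code, row in changes:
        if code in (1, 3):
            result[row[key]] = row
        elif code == 2:
            result.pop(row[key], None)
    return result

before = {"A": {"id": "A", "qty": 1}, "B": {"id": "B", "qty": 2}}
after  = {"A": {"id": "A", "qty": 5}, "C": {"id": "C", "qty": 9}}
changes = change_capture(before, after, "id")
print(change_apply(before, changes, "id"))
```

Note that this merge overwrites edited rows; to preserve history as the original poster wants, you would route the code-3 (edit) rows as inserts of a new version rather than overwrites.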
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn: Vincent McBurney
The difficulty with SCD in parallel jobs is the complicated nature of type 1, type 2, and hybrid 1-2 dimensions. You need to transform a new row and compare it to the current row, and you may compare only a subset of the columns. This means that if you have 40 columns, maybe only 35 pertain to detecting a difference. You can even use CRC32 to facilitate this comparison, by computing CRC32 values for the column groups in the current and new rows and then doing an equivalency test.
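The CRC32-over-a-column-subset comparison can be sketched as follows; this is a Python illustration of the idea, not DataStage code, and the column names are invented for the example.

```python
# Hash only the columns that drive change detection (the "35 of 40"
# subset), then compare one CRC instead of 35 column values.
import zlib

COMPARE_COLS = ["name", "city", "status"]   # illustrative subset

def row_crc(row, cols=COMPARE_COLS):
    # Join values with a separator unlikely to occur in the data,
    # so ("ab", "c") and ("a", "bc") do not produce the same payload.
    payload = "\x1f".join(str(row[c]) for c in cols)
    return zlib.crc32(payload.encode("utf-8"))

current = {"id": 7, "name": "Siva", "city": "Chennai", "status": "A", "load_ts": "2006-08-01"}
new     = {"id": 7, "name": "Siva", "city": "Sydney",  "status": "A"}

changed = row_crc(current) != row_crc(new)
print(changed)  # True: city differs
```

Columns outside the subset (here `load_ts`) never affect the hash, which is exactly why only the pertinent 35 columns feed the comparison. Remember CRC32 is a 32-bit checksum, so collisions are rare but possible.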
In addition, the hybrid 1-2 type means one smaller subset of columns causes an update, while another subset of columns causes an insert of the next variant. Also, in type 1 handling you potentially update the current row with an effective end date and optionally clear a current-indicator flag.
If there is any chance that you may have time-ordered data and need to detect a repeating natural key, then you could have a whole family of SCD rows to generate for the same natural key in the correct order. This requires some device to stage/store the rows as you assemble them. PX has no storage mechanism for holding rows and updating them: .ds datasets are write-once; you cannot reference a row from one and modify it. About the only method in this case is a custom buildop.
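The repeating-natural-key case described above can be sketched in Python to show why it needs mutable staging: each row for the same key must be closed off by the arrival of the next one. Field names (`nk`, `eff_date`) are invented for the illustration.

```python
# Build a family of type-2 variants per natural key from time-ordered
# input, giving each variant a contiguous [eff_date, end_date) range.
from itertools import groupby

HIGH_DATE = "9999-12-31"

def build_variants(rows):
    """rows: dicts with 'nk' (natural key) and 'eff_date' (ISO date string)."""
    out = []
    ordered = sorted(rows, key=lambda r: (r["nk"], r["eff_date"]))
    for nk, grp in groupby(ordered, key=lambda r: r["nk"]):
        grp = list(grp)
        for i, row in enumerate(grp):
            variant = dict(row)
            # Each variant is retired by the next row's effective date;
            # the last one in the group stays open and current.
            last = (i == len(grp) - 1)
            variant["end_date"] = HIGH_DATE if last else grp[i + 1]["eff_date"]
            variant["current_flag"] = "Y" if last else "N"
            out.append(variant)
    return out
```

This buffering of a whole key group before emitting any row is precisely what the post says PX lacks natively, hence the suggestion of a custom buildop.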
The combination of stages required to handle SCD-type logic, together with all of the various potential insert and update streams of rows, makes SCD in a parallel job a daunting task. If you have to deal with repeating natural keys in time-ordered data, you're now outside the tool and writing a buildop. For this reason, unless performance requires otherwise, you should consider using a Server job and hashed file stages. You can easily store the current row in a hashed file, transform the new row, reference and compare it to the current row, and either overwrite the current row with the newer row or insert a new row as well as retire/update the current row. Then extract your inserts and updates from the staging hashed file and load them to your target table.
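The Server-job pattern described above (hashed file as mutable staging) can be sketched with a plain dict standing in for the hashed file; this is a hedged illustration, not DataStage code, and the field names are invented.

```python
# Dict as stand-in for a hashed file: look up the current row by natural
# key, compare, and either insert a first version or retire the current
# row and stage the next variant. Unchanged rows fall through untouched.

HIGH_DATE = "9999-12-31"

hashed = {}                  # nk -> current row, like the staging hashed file
inserts, updates = [], []    # streams to load to the target afterwards

def process(row, today):
    nk = row["nk"]
    current = hashed.get(nk)                 # the hashed-file reference lookup
    if current is None:
        first = dict(row, end_date=HIGH_DATE, current_flag="Y")
        hashed[nk] = first
        inserts.append(first)
    elif row["payload"] != current["payload"]:
        current.update(end_date=today, current_flag="N")   # retire current row
        updates.append(current)
        nxt = dict(row, end_date=HIGH_DATE, current_flag="Y")
        hashed[nk] = nxt                     # overwrite with the newer row
        inserts.append(nxt)

process({"nk": 1, "payload": "a"}, "2006-08-01")   # new key -> insert
process({"nk": 1, "payload": "a"}, "2006-08-02")   # unchanged -> no-op
process({"nk": 1, "payload": "b"}, "2006-08-03")   # changed -> retire + insert
```

After the run, `inserts` and `updates` play the role of the extract from the staging hashed file that gets loaded to the target table.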
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle