Hi
Is there a way I could generate a sequence number in parallel jobs, as we do in server jobs using KeyManagementGetNextValue?
It should pick up the last generated sequence number from the previous run and start from there.
Thanks
Abhinav
Generate Sequence Number
Moderators: chulett, rschirm, roy
Parallel jobs have the Surrogate Key Generator stage; you can also use a counter in a Transformer stage as shown in the FAQ forum. Both methods will ensure you get unique generated numbers across your instances.
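As a rough illustration (plain Python, not DataStage code), the well-known Transformer counter formula from the FAQ, key = start + @PARTITIONNUM + @NUMPARTITIONS * (@INROWNUM - 1), can be simulated to show why keys never collide across partitions:

```python
# Simulation of the Transformer surrogate-key formula:
#   key = start + @PARTITIONNUM + @NUMPARTITIONS * (@INROWNUM - 1)
# Each partition produces an interleaved, collision-free stream of keys.

def partition_keys(partition_num, num_partitions, num_rows, start):
    """Keys generated by one partition; @INROWNUM is 1-based."""
    return [start + partition_num + num_partitions * (inrownum - 1)
            for inrownum in range(1, num_rows + 1)]

num_partitions = 4
start = 1001          # e.g. max(key) + 1 carried over from the previous run
all_keys = []
for p in range(num_partitions):
    all_keys.extend(partition_keys(p, num_partitions, num_rows=5, start=start))

assert len(all_keys) == len(set(all_keys))   # no duplicates across partitions
print(sorted(all_keys)[:8])                  # [1001, 1002, ..., 1008]
```

With equal row counts per partition the keys are even contiguous; with uneven counts you simply get gaps, which is harmless for surrogate keys.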
You have to make sure you have a correct starting value for your sequence. You can do this at the sequence job level by retrieving the current maximum value from the target database table via a shell script and then passing that into your parallel job as a job parameter. Use the job parameter as the initial value of your counter or surrogate key stage.
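Retrieving that starting value could look like the following sketch, which uses an in-memory SQLite table as a stand-in for the real target database (the table, column, and parameter names are made up for illustration):

```python
import sqlite3

# Stand-in for the target warehouse table; in practice this query would run
# against the real database before the parallel job starts.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dim_customer (cust_key INTEGER)")
con.executemany("INSERT INTO dim_customer VALUES (?)", [(1,), (2,), (3,)])

# COALESCE handles the very first run, when the table is still empty.
row = con.execute(
    "SELECT COALESCE(MAX(cust_key), 0) + 1 FROM dim_customer").fetchone()
start_value = row[0]
print(start_value)   # 4 -> pass this into the parallel job as a job
                     # parameter, e.g. dsjob -run -param START_KEY=4 ...
```

The same query wrapped in a shell script at the sequence level achieves what the post describes.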
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
Re: Generate Sequence Number
How about accomplishing this at the database level, e.g. using a sequence generator? Does that approach have any disadvantages compared to keys generated by DataStage?
Re: Generate Sequence Number
sun rays wrote: How about accomplishing this at the database level, like using a sequence generator? Does this have any disadvantages compared to the one generated by DataStage?

That sticks a choke point in the database: you have a single process handing out the next key, à la a sequence. If you are simultaneously loading multiple pipelines of data (partitioned parallelism) into the table, they will all congest around getting that sequence. You're better off using a generator in your tool to ensure that each pipeline can achieve maximum throughput, because each one works with its own range of keys from which to assign and need not worry that another pipeline is using the same number. Gaps in surrogate key assignments are okay, and gaps are what you end up with because of the range allocation to each pipeline.
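The range-allocation idea Ken describes can be sketched in plain Python: each pipeline claims a whole block of keys up front, so pipelines never contend per row, and any unused tail of a block simply becomes a gap (all names here are illustrative):

```python
from itertools import count

class BlockAllocator:
    """Hands each pipeline its own block of keys, so there is no per-row
    contention on a shared sequence. The unused tail of a block is a gap."""
    def __init__(self, start=1, block_size=1000):
        self._next_block = count(start, block_size)  # one claim per block
        self.block_size = block_size

    def claim_block(self):
        lo = next(self._next_block)                  # cheap, once per block
        return iter(range(lo, lo + self.block_size))

alloc = BlockAllocator(start=1, block_size=1000)
pipe_a = alloc.claim_block()   # keys 1..1000
pipe_b = alloc.claim_block()   # keys 1001..2000
print(next(pipe_a), next(pipe_b), next(pipe_a))   # 1 1001 2
```

Contrast this with a database sequence, where every row in every pipeline would make a round trip to the same single counter.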
And I'm not even talking about having to reference back the just assigned keys so that you can now embed them as foreign surrogate keys. Kind of silly to jam data into a database and then get it back out again for the next layer of data to load. Better to jam it all in at the end.
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle