Sequence or Unique identifer across tables in PX ?
Moderators: chulett, rschirm, roy
Sequence or Unique identifer across tables in PX ?
Hi ,
I have a scenario in which a unique identifier or a sequence number must be generated across jobs i.e. there are multiple jobs inserting on the same table and should insert the next sequence number/unique identifier per job. This unique identifier should then be passed and be inserted in TWO other tables.Tables are residing in an Oracle database.
Can someone please provide a solution/method on how one can go about this ?
I have a scenario in which a unique identifier or a sequence number must be generated across jobs i.e. there are multiple jobs inserting on the same table and should insert the next sequence number/unique identifier per job. This unique identifier should then be passed and be inserted in TWO other tables.Tables are residing in an Oracle database.
Can someone please provide a solution/method on how one can go about this ?
Last edited by Havoc on Fri Feb 02, 2007 7:59 am, edited 1 time in total.
How can we leverage the JOBNO to create a unique identifier?kduke wrote:Why not use JOBNO? This is stored in DS_JOBS. You can easily look it up based on the job name.
One more query is this JOBNO dynamic or static?
Basically there are multiple jobs and we have to avoid clashes when generating this unique identifer for this ONE table and pass same value to TWO other tables per Job.
I think using job name is better - since the JOBNO for a job may change (re-import to another project, saving of a copy, etc.).
Are you looking at something like an Oracle Sequence, but driven from DataStage?
Are you looking at something like an Oracle Sequence, but driven from DataStage?
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>
Well that is the point but we have to load this same sequence number for three other jobs. I'll give you an example with sample data to give u a clearer understanding of what we're trying to achieve.ArndW wrote:I think using job name is better - since the JOBNO for a job may change (re-import to another project, saving of a copy, etc.).
Are you looking at something like an Oracle Sequence, but driven from DataStage?
We have Table 1 and Table 2 which are loaded in the same job. Each of these tables share one common key which has to be a unique identifier/sequence number. I'll give u an example of the table data. It
looks like below:
ID COL1 COL2 COL3
----------------------------
1 A B C
2 D E F
3 G H I
4 J K L
ID COL
-----------
1 A
1 B
1 C
2 D
2 E
2 F
The above two tables are inserted in multiple jobs and the ID column should be uniquely generated across jobs.
PX has a surrogate key stage; but it generates unique keys only within the job itself. But you can specify the starting value dynamically, i.e. through a parameter to the job. If you can think of an algorithm that ensures each job only uses a given range of numbers you can use that method. The generated key can be up to an int*64 so that gives you a lot of numbers to use.
Using a reference lookup to a unique Oracle sequence is another option, but it will severely impact performance.
Using a reference lookup to a unique Oracle sequence is another option, but it will severely impact performance.
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Havoc - this is a solutionArndW wrote:...Using a reference lookup to a unique Oracle sequence is another option, but it will severely impact performance...
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>
Hey Arnd,ArndW wrote:Havoc - this is a solutionArndW wrote:...Using a reference lookup to a unique Oracle sequence is another option, but it will severely impact performance...
Just tried out the above method. But got a question..On what key can we join the input data with the Oracle sequence to get the nextval?
Tried this method out with a lookup to an Oracle stage with the SQL:
select TEST_SEQUENCE.nextval from dual;
But this resulted in all 0s in the 'nextval' field in our output sequential file.
I'll assume that the sql works from the command line. You need to specify a lookup stage and use sparse lookups; I'm not at work so can't give you the exact syntax but you need to specify a dummy input key column and use that in the SQL (so that DataStage actually performs the query for each row).
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Thanks ArndW and Kumar, this approach works. I get your point about the performance overhead but right now there is no other go.kumar_s wrote:Create a dummy column along with the Nextval, say with the constant value '1' and lookup on that. If you MINVALUE starts with negative, there is a possibility to get 0's, else no. ...