Page 1 of 1

Sequence or Unique identifer across tables in PX ?

Posted: Fri Feb 02, 2007 3:00 am
by Havoc
Hi ,


I have a scenario in which a unique identifier or a sequence number must be generated across jobs i.e. there are multiple jobs inserting on the same table and should insert the next sequence number/unique identifier per job. This unique identifier should then be passed and be inserted in TWO other tables.Tables are residing in an Oracle database.

Can someone please provide a solution/method on how one can go about this ?

Posted: Fri Feb 02, 2007 7:02 am
by kduke
Why not use JOBNO? This is stored in DS_JOBS. You can easily look it up based on the job name.

Posted: Fri Feb 02, 2007 7:52 am
by Havoc
kduke wrote:Why not use JOBNO? This is stored in DS_JOBS. You can easily look it up based on the job name.
How can we leverage the JOBNO to create a unique identifier?
One more query is this JOBNO dynamic or static?

Basically there are multiple jobs and we have to avoid clashes when generating this unique identifer for this ONE table and pass same value to TWO other tables per Job.

Posted: Fri Feb 02, 2007 7:56 am
by ArndW
I think using job name is better - since the JOBNO for a job may change (re-import to another project, saving of a copy, etc.).
Are you looking at something like an Oracle Sequence, but driven from DataStage?

Posted: Fri Feb 02, 2007 9:08 am
by Havoc
ArndW wrote:I think using job name is better - since the JOBNO for a job may change (re-import to another project, saving of a copy, etc.).
Are you looking at something like an Oracle Sequence, but driven from DataStage?
Well that is the point but we have to load this same sequence number for three other jobs. I'll give you an example with sample data to give u a clearer understanding of what we're trying to achieve.

We have Table 1 and Table 2 which are loaded in the same job. Each of these tables share one common key which has to be a unique identifier/sequence number. I'll give u an example of the table data. It
looks like below:

ID COL1 COL2 COL3
----------------------------
1 A B C
2 D E F
3 G H I
4 J K L


ID COL
-----------
1 A
1 B
1 C
2 D
2 E
2 F

The above two tables are inserted in multiple jobs and the ID column should be uniquely generated across jobs.

Posted: Fri Feb 02, 2007 9:21 am
by ArndW
PX has a surrogate key stage; but it generates unique keys only within the job itself. But you can specify the starting value dynamically, i.e. through a parameter to the job. If you can think of an algorithm that ensures each job only uses a given range of numbers you can use that method. The generated key can be up to an int*64 so that gives you a lot of numbers to use.

Using a reference lookup to a unique Oracle sequence is another option, but it will severely impact performance.

Posted: Fri Feb 02, 2007 3:12 pm
by ray.wurlod
Will these jobs be running consecutively or simultaneously?

Posted: Sun Feb 04, 2007 7:44 am
by Havoc
ray.wurlod wrote:Will these jobs be running consecutively or simultaneously? ...
Hi ray,

these jobs will be running simultaneously, still havent found a solution as to how we can go about implementing this. The unique identifer should be a sequence too :(

Posted: Sun Feb 04, 2007 7:52 am
by ArndW
ArndW wrote:...Using a reference lookup to a unique Oracle sequence is another option, but it will severely impact performance...
Havoc - this is a solution

Posted: Sun Feb 04, 2007 9:17 am
by Havoc
ArndW wrote:
ArndW wrote:...Using a reference lookup to a unique Oracle sequence is another option, but it will severely impact performance...
Havoc - this is a solution
Hey Arnd,

Just tried out the above method. But got a question..On what key can we join the input data with the Oracle sequence to get the nextval?

Tried this method out with a lookup to an Oracle stage with the SQL:

select TEST_SEQUENCE.nextval from dual;

But this resulted in all 0s in the 'nextval' field in our output sequential file.

:(

Posted: Sun Feb 04, 2007 11:38 am
by ArndW
I'll assume that the sql works from the command line. You need to specify a lookup stage and use sparse lookups; I'm not at work so can't give you the exact syntax but you need to specify a dummy input key column and use that in the SQL (so that DataStage actually performs the query for each row).

Posted: Sun Feb 04, 2007 2:43 pm
by ray.wurlod
The sequence does have to exist and to have been initialized.

Posted: Sun Feb 04, 2007 5:34 pm
by kumar_s
Create a dummy column along with the Nextval, say with the constant value '1' and lookup on that. If you MINVALUE starts with negative, there is a possibility to get 0's, else no.

Posted: Tue Feb 06, 2007 1:19 am
by Havoc
kumar_s wrote:Create a dummy column along with the Nextval, say with the constant value '1' and lookup on that. If you MINVALUE starts with negative, there is a possibility to get 0's, else no. ...
Thanks ArndW and Kumar, this approach works. :) I get your point about the performance overhead but right now there is no other go.