Sequence or Unique identifer across tables in PX ?

Havoc · Post by **Havoc** » Fri Feb 02, 2007 3:00 am

Hi ,

I have a scenario in which a unique identifier or a sequence number must be generated across jobs i.e. there are multiple jobs inserting on the same table and should insert the next sequence number/unique identifier per job. This unique identifier should then be passed and be inserted in TWO other tables.Tables are residing in an Oracle database.

Can someone please provide a solution/method on how one can go about this ?

kduke · Post by **kduke** » Fri Feb 02, 2007 7:02 am

Why not use JOBNO? This is stored in DS_JOBS. You can easily look it up based on the job name.

Havoc · Post by **Havoc** » Fri Feb 02, 2007 7:52 am

kduke wrote:Why not use JOBNO? This is stored in DS_JOBS. You can easily look it up based on the job name.

How can we leverage the JOBNO to create a unique identifier?
One more query is this JOBNO dynamic or static?

Basically there are multiple jobs and we have to avoid clashes when generating this unique identifer for this ONE table and pass same value to TWO other tables per Job.

ArndW · Post by **ArndW** » Fri Feb 02, 2007 7:56 am

I think using job name is better - since the JOBNO for a job may change (re-import to another project, saving of a copy, etc.).
Are you looking at something like an Oracle Sequence, but driven from DataStage?

Havoc · Post by **Havoc** » Fri Feb 02, 2007 9:08 am

ArndW wrote:I think using job name is better - since the JOBNO for a job may change (re-import to another project, saving of a copy, etc.).
Are you looking at something like an Oracle Sequence, but driven from DataStage?

Well that is the point but we have to load this same sequence number for three other jobs. I'll give you an example with sample data to give u a clearer understanding of what we're trying to achieve.

We have Table 1 and Table 2 which are loaded in the same job. Each of these tables share one common key which has to be a unique identifier/sequence number. I'll give u an example of the table data. It
looks like below:

ID COL1 COL2 COL3
----------------------------
1 A B C
2 D E F
3 G H I
4 J K L

ID COL
-----------
1 A
1 B
1 C
2 D
2 E
2 F

The above two tables are inserted in multiple jobs and the ID column should be uniquely generated across jobs.

ArndW · Post by **ArndW** » Fri Feb 02, 2007 9:21 am

PX has a surrogate key stage; but it generates unique keys only within the job itself. But you can specify the starting value dynamically, i.e. through a parameter to the job. If you can think of an algorithm that ensures each job only uses a given range of numbers you can use that method. The generated key can be up to an int*64 so that gives you a lot of numbers to use.

Using a reference lookup to a unique Oracle sequence is another option, but it will severely impact performance.

ray.wurlod · Post by **ray.wurlod** » Fri Feb 02, 2007 3:12 pm

Will these jobs be running consecutively or simultaneously?

Havoc · Post by **Havoc** » Sun Feb 04, 2007 7:44 am

ray.wurlod wrote:Will these jobs be running consecutively or simultaneously? ...

Hi ray,

these jobs will be running simultaneously, still havent found a solution as to how we can go about implementing this. The unique identifer should be a sequence too

ArndW · Post by **ArndW** » Sun Feb 04, 2007 7:52 am

ArndW wrote:...Using a reference lookup to a unique Oracle sequence is another option, but it will severely impact performance...

Havoc - this is a solution

Havoc · Post by **Havoc** » Sun Feb 04, 2007 9:17 am

ArndW wrote:
ArndW wrote:...Using a reference lookup to a unique Oracle sequence is another option, but it will severely impact performance...
Havoc - this is a solution

Hey Arnd,

Just tried out the above method. But got a question..On what key can we join the input data with the Oracle sequence to get the nextval?

Tried this method out with a lookup to an Oracle stage with the SQL:

select TEST_SEQUENCE.nextval from dual;

But this resulted in all 0s in the 'nextval' field in our output sequential file.

ArndW · Post by **ArndW** » Sun Feb 04, 2007 11:38 am

I'll assume that the sql works from the command line. You need to specify a lookup stage and use sparse lookups; I'm not at work so can't give you the exact syntax but you need to specify a dummy input key column and use that in the SQL (so that DataStage actually performs the query for each row).

ray.wurlod · Post by **ray.wurlod** » Sun Feb 04, 2007 2:43 pm

The sequence does have to exist and to have been initialized.

kumar_s · Post by **kumar_s** » Sun Feb 04, 2007 5:34 pm

Create a dummy column along with the Nextval, say with the constant value '1' and lookup on that. If you MINVALUE starts with negative, there is a possibility to get 0's, else no.

Havoc · Post by **Havoc** » Tue Feb 06, 2007 1:19 am

kumar_s wrote:Create a dummy column along with the Nextval, say with the constant value '1' and lookup on that. If you MINVALUE starts with negative, there is a possibility to get 0's, else no. ...

Thanks ArndW and Kumar, this approach works.

I get your point about the performance overhead but right now there is no other go.