Sequence Generator

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
tunde
Participant
Posts: 11
Joined: Thu Jun 05, 2003 7:43 am
Location: USA

Sequence Generator

Post by tunde »

Can someone help me out on this. i want to assign surrogate keys ( unique sequence keys ) to my table. How do i go about this. Ds does not seem to have a sequence generator within a parallel Job.

Again, what is the difference btw a parallel job and a server job and in what scenerios can they be used?

Tunde
tonystark622
Premium Member
Premium Member
Posts: 483
Joined: Thu Jun 12, 2003 4:47 pm
Location: St. Louis, Missouri USA

Post by tonystark622 »

tunde,

There is a transform under TransformsSDKKeyMgt called KeyMgtGetNextValueConcurrent

I can't see why this wouldn't work on a parallel job.

Good Luck,
Tony
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Sometime soon (as soon as we get legal clearance from ASCL) a White Paper on the differences between server jobs and parallel jobs will be published on the DataStageXChange (www.datastagexchange.com).
The main difference is the execution environment. Server jobs execute typically as single streams in single sets of processes on the server. Parallel jobs take advantage of Parallel Extender (the name ASCL gave to the Orchestrate technology they acquired with Torrent Systems last year) to allow simultaneous execution on multiple "processing nodes" - multiple CPUs in an SMP environment, or multiple notes in an MPP/cluser environment. Absolutely huge volumes of data can be processed in this way (see ASCL web site for benchmark information).

Not sure about what you mean in your assertion that "Ds does not seem to have a sequence generator within a parallel job".
You can use the Row Generator stage to generate rows, you still have access to system variables @INROWNUM and @OUTROWNUM for column derivation and output constraint expressions.
If you mean a job sequence generator, this is neither part of a server job nor a parallel job; it's a third kind of job (you can use File > New > Job Sequencer. A job sequencer can start either server or mainframe jobs, or even a mix of the two.

Transforms are not available in parallel jobs, since Transforms encapuslate BASIC expressions, and parallel jobs do not execute in a BASIC environment. Routines similarly are unavailable in parallel jobs, for the same reason. A different set of functions than those used in server jobs is provided for use in parallel jobs.

Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
nelc
Participant
Posts: 23
Joined: Wed Apr 16, 2003 4:34 am

Post by nelc »

Another add-on query relating to this thread:
@INROWNUM and @OUTROWNUM is with respect to the total number of the source record. Is there any way to group the sequence number such that it restarts with every new ID?

Example:
ID SEQ
1 1
1 2
1 3
2 1
2 2
2 3

Any help is much appreciated! [:)]
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

No, it's a strict sequence number counting the rows coming in on the input link or going out on an output link.
Why do you want to re-start the count? This is not the strict definintion of surrogate key (which only provides uniqueness, and has no inherent meaning).
You can do it, but instead of using system variables you use stage variables in a Transformer stage. One variable for the counter (initialize to zero), one to detect the change in whatever it is you're using for "ID", perhaps a third that is true if the ID value has changed, false otherwise. This assumes that the input data to the Transformer stage are sorted, which is easy to do in parallel jobs.
Presumably your data are partitioned on the same "ID" column, so that uniqueness is guaranteed across multiple data sets being processed.
Another white paper, this time on stage variables, is ready for publication on DSX; it, too, awaits legal clearance from Ascential.

Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
MAT
Participant
Posts: 65
Joined: Wed Mar 05, 2003 8:44 am
Location: Montréal, Canada

Post by MAT »

Hi Tunde,

PX has a stage called Column Generator, we used it for generating our surrogate keys. You can even make it assign keys in parallel. Since you can't use a routine, this was the best we found.

MAT
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Ah, learned something today! :)
(I'm still in the process of learning all the things that parallel jobs can do.)
nelc
Participant
Posts: 23
Joined: Wed Apr 16, 2003 4:34 am

Post by nelc »

Thanks Ray for the advice on the group seq generator.

I tried what you suggested by implementing the following stage variables:

Var Derivation
CheckID RowProcCompareWithPreviousValue(input.ID)
Seq If (CheckID) then (Seq+1) else 0

However I still get errors with unique key constraints as this seq no and ID is a composite key of the table.
Is there problem with my derivation column? Is the above implementation as what you had explained?

Thanks again! [:)]
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Your Seq is generating lots of zeroes.
Should its derivation have been If CheckID Then (Seq + 1) Else Seq ?
Post Reply