Generating Sequence Numbers in Parallel Transformer

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

srimitta wrote: One more thing: the data in the Data Set is not in sequence order. It starts from 1, and after 96 the rows jump to 3704 through 3799, then resume at 97 and run to 192.

The sequence number changes approximately every 95 rows.

Any idea what's going on and how to straighten this out?

Thanks
srimitta
All the numbers are there - what you are seeing is an artifact of how the numbers are blocked, and possibly of how you are sampling in View Data. It looks like you are getting approximately 96 rows per block (the actual number depends, of course, on the row size).
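
To make the blocking effect concrete, here is a toy model in Python. It is only a sketch, not a description of how Data Sets are actually laid out: the two-node layout, the key ranges and the "one block from each node in turn" read order are assumptions chosen to reproduce the pattern reported above, and the helper names are hypothetical.

```python
from itertools import zip_longest

BLOCK_SIZE = 96  # rows per block; the thread reports roughly this figure


def split_into_blocks(rows, block_size=BLOCK_SIZE):
    """Cut one partition's rows into fixed-size blocks."""
    return [rows[i:i + block_size] for i in range(0, len(rows), block_size)]


def read_interleaved(partitions, block_size=BLOCK_SIZE):
    """Read one block from each partition in turn (assumed read order)."""
    blocked = [split_into_blocks(p, block_size) for p in partitions]
    out = []
    for round_of_blocks in zip_longest(*blocked, fillvalue=[]):
        for block in round_of_blocks:
            out.extend(block)
    return out


# Hypothetical layout: keys 1..3703 on one node, 3704..7406 on the other.
node0 = list(range(1, 3704))
node1 = list(range(3704, 7407))
rows = read_interleaved([node0, node1])

print(rows[0], rows[95])     # first block, from node 0
print(rows[96], rows[191])   # second block, from node 1
print(rows[192], rows[287])  # third block, from node 0 again
print(sorted(rows) == list(range(1, 7407)))  # every key is present
```

In this model the viewer shows 1 to 96, then 3704 to 3799, then 97 to 192, yet sorting the rows recovers an unbroken sequence.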

The values are not necessarily stored in sorted order in the Data Set - did you sort them on the way in?

Even if they are, as you retrieve rows from the different processing nodes it will appear that there are huge jumps. How your data are partitioned also affects how big these jumps seem to be. For example, with Round Robin partitioning and two nodes you will tend to get even numbers on one node and odd numbers on the other.
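
The Round Robin case can be sketched the same way. This is a simplified illustration in Python, assuming rows arrive already numbered 1, 2, 3, ... and are dealt to the nodes strictly in turn; `round_robin` is a hypothetical helper, not a DataStage function.

```python
def round_robin(rows, num_nodes):
    """Deal rows to nodes in turn, like dealing cards."""
    parts = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        parts[i % num_nodes].append(row)
    return parts


parts = round_robin(list(range(1, 11)), 2)
print(parts[0])  # [1, 3, 5, 7, 9]
print(parts[1])  # [2, 4, 6, 8, 10]

# Reading node 0 completely and then node 1 shows an apparent jump
# from 9 back to 2, although no value is missing.
print(parts[0] + parts[1])
```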

I exhort you to experiment further with the variations on this theme, and try to understand what's happening and what's going where.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rwierdsm
Premium Member
Posts: 209
Joined: Fri Jan 09, 2004 1:14 pm
Location: Toronto, Canada

Post by rwierdsm »

We tried a number of different options in parallel, but none could give us an acceptable result.

In the end, we created a sequential process to generate our surrogate keys.

Rob
Rob Wierdsma
Toronto, Canada
bartonbishop.com
dohertys
Participant
Posts: 39
Joined: Thu Oct 11, 2007 3:26 am
Location: Sheffield

Post by dohertys »

The way I found recommended in the documentation was to use a Surrogate Key Generator stage, with 'Execution Mode' set to sequential and 'Collector Type' set to round robin.

Any use?
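
As a rough picture of why a sequential stage with a round robin collector gives gap-free keys, here is a small Python sketch. It does not reproduce the Surrogate Key Generator stage itself; `round_robin_collect` and `assign_keys` are hypothetical helpers that only model "merge the partitions into one stream, then number the rows with a single counter".

```python
from itertools import count, zip_longest

_MISSING = object()  # marker for partitions that have run out of rows


def round_robin_collect(partitions):
    """Take one row from each partition in turn until all are exhausted."""
    for group in zip_longest(*partitions, fillvalue=_MISSING):
        for row in group:
            if row is not _MISSING:
                yield row


def assign_keys(rows, start=1):
    """Number the rows from a single counter, as a sequential stage would."""
    counter = count(start)
    return [(next(counter), row) for row in rows]


partitions = [["a", "c", "e"], ["b", "d"]]
keyed = assign_keys(round_robin_collect(partitions))
print(keyed)  # [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')]
```

Because only one counter exists, the keys are contiguous, at the cost of funnelling every row through a single process.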
boxtoby
Premium Member
Posts: 138
Joined: Mon Mar 13, 2006 5:11 pm
Location: UK

Post by boxtoby »

I have used this derivation in the past for a surrogate key:

@PARTITIONNUM+1 : @INROWNUM

I wouldn't recommend it for a permanent key value, but it works well for temporary storage in a dataset.
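
For anyone wanting to see what that derivation produces, here is a small Python model. It assumes the expression is read as (@PARTITIONNUM + 1) concatenated with @INROWNUM, that @PARTITIONNUM counts from 0, and that @INROWNUM counts from 1 within each partition; `concat_key` and `keys_for_job` are hypothetical names used only for this sketch.

```python
def concat_key(partition_num, in_row_num):
    """Mimic (@PARTITIONNUM + 1) : @INROWNUM as string concatenation."""
    return f"{partition_num + 1}{in_row_num}"


def keys_for_job(num_partitions, rows_per_partition):
    """Generate the keys for every row on every partition."""
    return [
        concat_key(p, r)
        for p in range(num_partitions)
        for r in range(1, rows_per_partition + 1)
    ]


two_nodes = keys_for_job(2, 3)
print(two_nodes)  # ['11', '12', '13', '21', '22', '23']

# With 11 or more partitions the concatenation is ambiguous in this model:
# partition 0, row 11 and partition 10, row 1 both yield '111'.
print(concat_key(0, 11), concat_key(10, 1))

many_nodes = keys_for_job(11, 11)
print(len(many_nodes), len(set(many_nodes)))  # fewer distinct keys than rows
```

The keys are unique on a small configuration but are neither contiguous nor safe once the partition prefix reaches two digits, which fits the advice to keep this for temporary use.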
Bob Oxtoby
srimitta
Premium Member
Posts: 187
Joined: Sun Apr 04, 2004 7:50 pm

Post by srimitta »

We took a workaround approach: we forced the source and lookup Data Sets to be created on the same node, and now the surrogate keys are in sequence in parallel mode.

Thanks
srimitta
Quality is never an accident; it is always the result of high intention, sincere effort, intelligent direction and skillful execution; it represents the wise choice of many alternatives.
By William A.Foster