
Generating Sequence numbers in Parallel Transformer

Posted: Mon Nov 19, 2007 2:43 pm
by srimitta
Hi All,

Scenario 1: Generating sequence numbers by using the Surrogate Key option with the NextSurrogateKey() utility in a parallel Transformer stage.

Scenario 2: Calling the system variable @OUTROWNUM in an output derivation.

In both scenarios the DataStage job is generating even numbers. I know this is because there is more than one partition.

Is there a way to generate sequence numbers within the Transformer stage, other than using the Surrogate Key Generator stage itself?

Any ideas would be great :idea:

Thanks
srimitta

Posted: Mon Nov 19, 2007 2:46 pm
by ray.wurlod
Use a stage variable initialized to the partition number (or, perhaps, (next available key value + partition number)) and incrementing by partition count. There are system variables that yield these partition values.
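
A rough illustration of this suggestion, written as a Python simulation rather than DataStage syntax: each partition starts at its own partition number and steps by the partition count, so no two partitions can produce the same value. The function name `partition_keys` and the `start` offset are hypothetical; `partition_num` and `num_partitions` stand in for @PARTITIONNUM and @NUMPARTITIONS, and the sketch assumes partition numbers are zero-based.

```python
def partition_keys(partition_num, num_partitions, row_count, start=1):
    """Simulate the keys one partition would generate.

    partition_num  : stand-in for @PARTITIONNUM (assumed zero-based)
    num_partitions : stand-in for @NUMPARTITIONS
    row_count      : number of rows this partition processes
    start          : hypothetical first key value for the whole job
    """
    # Start at (start + partition number), then step by the partition count.
    return [start + partition_num + i * num_partitions for i in range(row_count)]


# Two partitions, three rows each.
p0 = partition_keys(0, 2, 3)
p1 = partition_keys(1, 2, 3)

# Merged and sorted, the two partitions form one gap-free sequence.
all_keys = sorted(p0 + p1)
print(p0, p1, all_keys)
```

In Transformer terms this would probably correspond to a derivation along the lines of (@INROWNUM - 1) * @NUMPARTITIONS + @PARTITIONNUM + 1, assuming @INROWNUM counts rows per partition starting at 1; treat that expression as a sketch to verify in your own job rather than a confirmed recipe.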

Posted: Mon Nov 19, 2007 2:54 pm
by srimitta
Thanks Ray for the response.

@PARTITIONNUM
@INROWNUM
@NUMPARTITIONS
@OUTROWNUM

Can you help me with how to use these variables in the Transformer stage to get the right sequence?

Thanks
Srimitta

Posted: Mon Nov 19, 2007 2:56 pm
by srimitta
You mean NextSurrogateKey() + @PARTITIONNUM?

Posted: Mon Nov 19, 2007 2:58 pm
by srimitta
Sorry for the repeated posts,

You mean (NextSurrogateKey() + @PARTITIONNUM) + @NUMPARTITIONS?

Posted: Mon Nov 19, 2007 4:10 pm
by srimitta
Thanks Ray I got it working :)

In output derivation
(NextSurrogateKey() + @PARTITIONNUM) / @NUMPARTITIONS

Posted: Mon Nov 19, 2007 4:15 pm
by srimitta
Oops, I could generate sequence numbers from one partition only.
Any :idea: what changes I need to make to produce the right sequence?

Posted: Mon Nov 19, 2007 5:14 pm
by ray.wurlod
What is occurring on the other partitions?

Posted: Mon Nov 19, 2007 5:26 pm
by srimitta
The Transformer input link is reading 7707 rows and writing the same number of rows (7707) into the DataSet, but when I try to view the data I can see only 50% of the total input records.

I checked the job log for errors and didn't find any.

Do you think the code is wrong, or am I missing something here?

Thanks
srimitta

Posted: Mon Nov 19, 2007 7:32 pm
by ray.wurlod
Use the Data Set Management tool to determine how many rows are actually stored on each partition (node).

Posted: Tue Nov 20, 2007 9:49 am
by srimitta

Partitions:
#       Node        Records     Blocks      Bytes
0       node1       3609        38          4893804
1       node2       3703        39          5021268

When I try to view the data from Data Set Management, it takes forever.

Any idea what I need to look into if I want to view all the rows in the DataSet?

Thanks
srimita

Posted: Tue Nov 20, 2007 10:13 am
by srimitta
One more thing: the data in the DataSet is not in sequence order. It starts from 1, and after 96 the rows start again from 3704 and end at 3799, then start from 97 and end at 192.

The sequence number changes approximately every 95 rows.

Any idea what's going on and how to straighten this out?

Thanks
srimitta

Posted: Tue Nov 20, 2007 11:06 am
by laxmi_etl

Posted: Tue Nov 20, 2007 1:08 pm
by rwierdsm
srimitta wrote:Thanks Ray I got it working :)

In output derivation
(NextSurrogateKey() + @PARTITIONNUM) / @NUMPARTITIONS
Yikes! What happens when the DS admin guys decide to change the number of partitions in production?

Rob
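
One way to look at this concern is to check what a formula built only from the partition number and partition count does when the degree of parallelism changes. The Python simulation below is a sketch under assumptions, not DataStage code: `keys_for_run` is a hypothetical helper, the two-partition row counts are the ones reported by Data Set Management earlier in the thread, and the four-partition split is invented for illustration. It suggests that start + partition number + n * partition count stays unique within a run for any partition count, but leaves gaps when partitions hold different numbers of rows.

```python
def keys_for_run(rows_per_partition, start=1):
    """Simulate one job run: every partition emits
    start + partition_num + i * num_partitions for its i-th row (i from 0)."""
    num_partitions = len(rows_per_partition)
    keys = []
    for partition_num, rows in enumerate(rows_per_partition):
        keys.extend(start + partition_num + i * num_partitions for i in range(rows))
    return keys


# Row counts per node as reported in the thread (two partitions).
two = keys_for_run([3609, 3703])

# Hypothetical split of the same 7312 rows over four partitions.
four = keys_for_run([1800, 1809, 1850, 1853])

for name, keys in (("2 partitions", two), ("4 partitions", four)):
    unique = len(keys) == len(set(keys))
    # With start=1, values missing from 1..max(keys) are the gaps.
    gaps = max(keys) - len(keys)
    print(name, "unique:", unique, "gaps:", gaps)
```

If a strictly gap-free sequence is required, this scheme alone is not enough unless the rows are spread evenly, for example with round-robin partitioning; that trade-off is worth deciding before the key is relied on downstream.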

Posted: Tue Nov 20, 2007 2:26 pm
by srimitta
It would be great if you could help me, if you have a better approach.

Thanks
srimitta