Generating Sequence numbers in Parallel Transformer
Posted: Mon Nov 19, 2007 2:43 pm
by srimitta
Hi All,
Scenario 1: generating sequence numbers using the Surrogate Key option with the NextSurrogateKey() utility in a parallel Transformer stage.
Scenario 2: calling the system variable @OUTROWNUM in an output derivation.
In both scenarios the DataStage job is generating even numbers; I know this is because there is more than one partition.
Is there a way to generate sequence numbers within the Transformer stage, other than using the Surrogate Key Generator stage itself?
Any ideas would be great.
Thanks
srimitta
Posted: Mon Nov 19, 2007 2:46 pm
by ray.wurlod
Use a stage variable initialized to the partition number (or, perhaps, (next available key value + partition number)) and incrementing by partition count. There are system variables that yield these partition values.
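A minimal sketch of the stage-variable idea above, simulated in Python rather than written as DataStage code. It assumes partition numbers are zero-based (as @PARTITIONNUM is) and that the partition count matches @NUMPARTITIONS. The names `partition_keys` and `all_keys` are hypothetical helpers for illustration only.

```python
from typing import Iterator, List


def partition_keys(partition_num: int, num_partitions: int,
                   row_count: int, start: int = 1) -> Iterator[int]:
    """Yield the keys one partition would produce.

    The counter begins at start + partition_num and then steps by
    num_partitions, mirroring a stage variable initialized to the
    partition number and incremented by the partition count.
    """
    key = start + partition_num      # initial value of the stage variable
    for _ in range(row_count):
        yield key
        key += num_partitions        # increment by the partition count


def all_keys(rows_per_partition: List[int], start: int = 1) -> List[int]:
    """Collect and sort the keys from every partition."""
    n = len(rows_per_partition)
    keys: List[int] = []
    for p, count in enumerate(rows_per_partition):
        keys.extend(partition_keys(p, n, count, start))
    return sorted(keys)


if __name__ == "__main__":
    print(all_keys([3, 3]))   # balanced partitions: [1, 2, 3, 4, 5, 6]
    print(all_keys([4, 2]))   # skewed partitions:   [1, 2, 3, 4, 5, 7]
```

The keys are always unique across partitions, but they are gap-free only when the rows are spread evenly (for example with round-robin partitioning); a skewed distribution leaves holes, as the second call shows.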
Posted: Mon Nov 19, 2007 2:54 pm
by srimitta
Thanks Ray for the response.
@PARTITIONNUM
@INROWNUM
@NUMPARTITIONS
@OUTROWNUM
Can you help me with how to use these variables in the Transformer stage to get the right sequence?
Thanks
Srimitta
Posted: Mon Nov 19, 2007 2:56 pm
by srimitta
You mean NextSurrogateKey() + @PARTITIONNUM
Posted: Mon Nov 19, 2007 2:58 pm
by srimitta
Sorry for the repeated posts,
You mean (NextSurrogateKey() + @PARTITIONNUM) + @NUMPARTITIONS
Posted: Mon Nov 19, 2007 4:10 pm
by srimitta
Thanks Ray, I got it working.
In the output derivation:
(NextSurrogateKey() + @PARTITIONNUM) / @NUMPARTITIONS
Posted: Mon Nov 19, 2007 4:15 pm
by srimitta
Oops, I could generate sequence numbers from one partition only.
Any idea what changes I need to make to get the right sequence?
Posted: Mon Nov 19, 2007 5:14 pm
by ray.wurlod
What is occurring on the other partitions?
Posted: Mon Nov 19, 2007 5:26 pm
by srimitta
The Transformer input link reads 7707 rows and writes the same number of rows (7707) into the DataSet, but when I try to view the data I can see only 50% of the total input records.
I checked the job log for errors and didn't find any.
Do you think the code is wrong, or am I missing something here?
Thanks
srimitta
Posted: Mon Nov 19, 2007 7:32 pm
by ray.wurlod
Use the Data Set Management tool to determine how many rows are actually stored on each partition (node).
Posted: Tue Nov 20, 2007 9:49 am
by srimitta
Code:
Partitions:
#  Node   Records  Blocks  Bytes
0  node1  3609     38      4893804
1  node2  3703     39      5021268
When I try to view the data from Data Set Management, it takes forever.
Any idea what I need to look into if I want to view all the rows in the DataSet?
Thanks
srimita
Posted: Tue Nov 20, 2007 10:13 am
by srimitta
One more thing: the data in the DataSet is not in sequence order. It starts from 1, and after 96 the rows jump to 3704 and run to 3799, then resume from 97 and run to 192.
The sequence number range changes approximately every 95 rows.
Any idea what's going on and how to straighten this out?
Thanks
srimitta
Posted: Tue Nov 20, 2007 11:06 am
by laxmi_etl
Posted: Tue Nov 20, 2007 1:08 pm
by rwierdsm
srimitta wrote: Thanks Ray I got it working
In output derivation
(NextSurrogateKey() + @PARTITIONNUM) / @NUMPARTITIONS
Yikes! What happens when the DS admin guys decide to change the number of partitions in production?
Rob
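One way to check the concern about a changing partition count is to simulate a derivation that reads the partition count at run time instead of baking it in. The sketch below is Python, not DataStage code, and models the commonly used expression (@INROWNUM - 1) * @NUMPARTITIONS + @PARTITIONNUM + 1. It assumes @INROWNUM is a one-based per-partition row counter, @PARTITIONNUM is zero-based, and rows arrive round-robin. `row_key` and `keys_for_run` are hypothetical names.

```python
from typing import List


def row_key(in_row_num: int, partition_num: int, num_partitions: int) -> int:
    """Model of (@INROWNUM - 1) * @NUMPARTITIONS + @PARTITIONNUM + 1."""
    return (in_row_num - 1) * num_partitions + partition_num + 1


def keys_for_run(total_rows: int, num_partitions: int) -> List[int]:
    """Distribute rows round-robin and compute a key for each row."""
    rows_seen = [0] * num_partitions      # per-partition row counters
    keys: List[int] = []
    for i in range(total_rows):
        p = i % num_partitions            # round-robin partition choice
        rows_seen[p] += 1                 # one-based row number on partition p
        keys.append(row_key(rows_seen[p], p, num_partitions))
    return keys


if __name__ == "__main__":
    # The same derivation yields 1..N whatever the partition count is.
    for n in (1, 2, 4, 8):
        assert keys_for_run(7707, n) == list(range(1, 7708))
    print("sequence is gap-free for 1, 2, 4 and 8 partitions")
```

Under these assumptions the result does not depend on a particular configuration, because the partition count is picked up from the running job. With a partitioning method that does not balance rows evenly, the keys would stay unique but could contain gaps.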
Posted: Tue Nov 20, 2007 2:26 pm
by srimitta
It would be great if you could help me, if you have a better approach.
Thanks
srimitta