Hi,
I am generating unique key in transformer on the basis of @NUMPARTITION and @PARTITIONNUM.
Though it is generating unique keys but the order is random.Cause more columns are passing throgh 1 node.Its a 5 node file.
I have tried RoundRobin partition also.
Nothing happening!!!!!!!
I want to generate consecutive key values.
RANDOM SURROGATE KEY
Moderators: chulett, rschirm, roy
Jayant,
For generating the surrogate key, you can either use the Surrogate Key stage as Balaji said or as you have done, use the Transformer. Both have pros and cons.
If you have to use the surrogate key stage, then you will have to give the start value in the stage properties (i am not sure if this can be parameterised), which is not preferred by many.
Now the transformer method is the most preferred and used. But here there is the possibility that the generated keys might miss out a few number when @NUMPARTITION and @PARTITIONNUM are used, depending on the number of nodes in the configuration file. This cannot be avoided, as some of the nodes might have finished processsing their records, while some others might just be finishing. But the generated keys would be in a sequnce, according to the formula that you have given, except towards the end of the records where a few numbers might be missed out. If you do an order by on the Surrogate key column, then you will see this.
The surrogate key geneneration sequence that I use is -
where SeedVal is the last generated key. The 1 before that is for incrementing the generated keys by 1 for the next record.
Hope this helps.
Regards,
The Bird.
For generating the surrogate key, you can either use the Surrogate Key stage as Balaji said or as you have done, use the Transformer. Both have pros and cons.
If you have to use the surrogate key stage, then you will have to give the start value in the stage properties (i am not sure if this can be parameterised), which is not preferred by many.
Now the transformer method is the most preferred and used. But here there is the possibility that the generated keys might miss out a few number when @NUMPARTITION and @PARTITIONNUM are used, depending on the number of nodes in the configuration file. This cannot be avoided, as some of the nodes might have finished processsing their records, while some others might just be finishing. But the generated keys would be in a sequnce, according to the formula that you have given, except towards the end of the records where a few numbers might be missed out. If you do an order by on the Surrogate key column, then you will see this.
The surrogate key geneneration sequence that I use is -
Code: Select all
@PARTITIONNUM + (@NUMPARTITIONS * (@INROWNUM - 1)) + 1 + SeedVal
Hope this helps.
Regards,
The Bird.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Surrogate keys are ONLY about uniqueness. There is no requirement that they form an unbroken sequence. If there are gaps, this does not detract from the fact that each is unique. Change your mindset.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Charter Member
- Posts: 166
- Joined: Wed Mar 16, 2005 6:52 am
- Location: Mumbai, India
The randomness witnessed is because your target stage reads data from all(5 ?) nodes and writes the data sequentially in the order it receives it from those nodes. It does not sort the data.
As stated by Ray above, the keys will be unique. The round robin partitioning will minimize gaps but will not gaurantee an orderly write. you will still see randomness in the key order.
HTH
As stated by Ray above, the keys will be unique. The round robin partitioning will minimize gaps but will not gaurantee an orderly write. you will still see randomness in the key order.
HTH
Amey Vaidya<i>
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.</i>
<i>- Douglas Adams</i>
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.</i>
<i>- Douglas Adams</i>
unique columns values in parallel transformer stage
Hi
I tried to generate the column in parallel transfomr stage with unique values using stage variables .
here was my defination for stage variable
1) Create a stage variable for the counter, eg. SVCounter.
2) At the Stage Properties form set the Initial Value of the Stage Variable to "@PARTITIONNUM - @NUMPARTITIONS + 1".
3) Set the derivation of the stage variable to "svCounter + @NUMPARTITIONS". You can embed this in an IF statement if it is a conditional counter.
which i got from one of members in this forum ... thanks
but still i get error saying that
" dupliate identifier"
can you guys help me out.
I tried to generate the column in parallel transfomr stage with unique values using stage variables .
here was my defination for stage variable
1) Create a stage variable for the counter, eg. SVCounter.
2) At the Stage Properties form set the Initial Value of the Stage Variable to "@PARTITIONNUM - @NUMPARTITIONS + 1".
3) Set the derivation of the stage variable to "svCounter + @NUMPARTITIONS". You can embed this in an IF statement if it is a conditional counter.
which i got from one of members in this forum ... thanks
but still i get error saying that
" dupliate identifier"
can you guys help me out.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact: