Transformation Logic

prasson_ibm · Post by **prasson_ibm** » Mon Jul 13, 2009 4:21 am

Hi all,
I have a source dataset with two columns, like this:

Col1,Col2
1,A
1,B
1,C
I want to develop some logic in a Transformer so that my target data looks like this:
Col1,Col2
2,A
3,B
4,C
Can anyone help me do this using a stage variable? Or is there another way to do it?

priyadarshikunal · Post by **priyadarshikunal** » Mon Jul 13, 2009 5:18 am

Have you tried searching the forum?

Stage variables are not a good option unless you have to reset the counter by key on key-partitioned data.

If you just want to generate a sequence number, use the Surrogate Key Generator stage or the surrogate key functionality of the Transformer.

prasson_ibm · Post by **prasson_ibm** » Mon Jul 13, 2009 5:31 am

Yes, I want a counter that is incremented by 1 for every new record, with a starting value of Col1 + 1.

Sainath.Srinivasan · Post by **Sainath.Srinivasan** » Mon Jul 13, 2009 6:04 am

Run the data through a Sort stage and add 1 in a Transformer.

Note: this is the idea for a single-node run.

For multiple nodes, you must partition the data by the proper key columns and add (partition number * incrementalStageVariable).

Search the forum for "partition number" to get more information.
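
For readers who want to see why the partition number matters on a multi-node run, here is a minimal Python sketch. It is only a simulation, not DataStage code, and it uses one commonly seen interleaving scheme (row index within the partition times the number of partitions, plus the partition number), which is not necessarily the exact formula meant above. In a parallel Transformer the corresponding inputs would typically come from system variables such as @PARTITIONNUM and @NUMPARTITIONS. The function name and sample data are hypothetical.

```python
def interleaved_ids(partitions):
    """Assign collision-free sequence numbers across partitions.

    `partitions` is a list of lists; each inner list holds the rows
    that one node would process. Each node counts its own rows
    independently, so the partition number is mixed in to keep the
    generated values unique across nodes.
    """
    num_partitions = len(partitions)
    out = []
    for partition_num, rows in enumerate(partitions):
        for row_index, row in enumerate(rows):
            # row_index plays the role of a per-node counter starting at 0
            seq = row_index * num_partitions + partition_num + 1
            out.append((seq, row))
    return out


# Hypothetical two-node run: node 0 gets A and C, node 1 gets B.
numbered = interleaved_ids([["A", "C"], ["B"]])
print(sorted(numbered))
```

The values are unique across nodes, but they are only gap-free when every partition holds the same number of rows.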

prasson_ibm · Post by **prasson_ibm** » Mon Jul 13, 2009 6:31 am

Can you please tell me the logic for adding 1 in the Transformer?

priyadarshikunal · Post by **priyadarshikunal** » Mon Jul 13, 2009 6:54 am

Create a stage variable, say svCount, and initialize it to 0.

In its derivation, put:

Code:

if svCount=0 then col1+1 else svCount+1

To reiterate my previous post, and what Sainath said: if you want to increment that value for each record, then it is best to use a surrogate key.
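
To make the row-by-row behaviour of that derivation easier to follow, here is a minimal Python sketch that simulates it on the sample data from the first post. It assumes a single-node run and that svCount is written to the output Col1; the function name is hypothetical and this is not DataStage code.

```python
def apply_counter(rows):
    """Mimic a stage variable svCount that is initialized to 0 and
    re-evaluated once per input row."""
    sv_count = 0  # initial value of the stage variable
    out = []
    for col1, col2 in rows:
        # Same logic as: if svCount=0 then col1+1 else svCount+1
        sv_count = col1 + 1 if sv_count == 0 else sv_count + 1
        out.append((sv_count, col2))
    return out


source = [(1, "A"), (1, "B"), (1, "C")]
print(apply_counter(source))  # [(2, 'A'), (3, 'B'), (4, 'C')]
```

Because 0 doubles as the "not started yet" marker, the counter would restart on any row where it happens to equal 0 (for example, if the first Col1 value were -1).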

prasson_ibm · Post by **prasson_ibm** » Mon Jul 13, 2009 7:08 am

Thanks a lot, it resolved my requirement.

priyadarshikunal · Post by **priyadarshikunal** » Mon Jul 13, 2009 7:11 am

Then it's time to mark this post as Resolved.

Sainath.Srinivasan · Post by **Sainath.Srinivasan** » Mon Jul 13, 2009 7:14 am

Priyadharshi,

Just a note: if the incoming data has duplicates, then the Surrogate Key Generator stage may only act as a column generator. The OP would then have to split the rows into unique and duplicate values.

So it may be better to use stage variables with the key change column from the Sort stage.
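
As an illustration of the key-change idea, here is a minimal Python sketch that restarts the counter whenever the sort key changes. It assumes the key is Col1, that the counter restarts at Col1 + 1 for each new key, and that everything runs on one node; the `key_change` flag stands in for the key change column a Sort stage can emit. The function name and the extra sample rows are hypothetical.

```python
def number_within_groups(rows):
    """Sort rows by Col1, then number them with a counter that
    restarts at Col1 + 1 each time the key value changes."""
    rows = sorted(rows, key=lambda r: r[0])  # stable sort on the key
    out = []
    prev_key = object()  # sentinel that never equals a real key
    sv_count = 0
    for col1, col2 in rows:
        key_change = col1 != prev_key  # True on the first row of each group
        sv_count = col1 + 1 if key_change else sv_count + 1
        out.append((sv_count, col2))
        prev_key = col1
    return out


data = [(1, "A"), (1, "B"), (1, "C"), (5, "X"), (5, "Y")]
print(number_within_groups(data))
# [(2, 'A'), (3, 'B'), (4, 'C'), (6, 'X'), (7, 'Y')]
```

Driving the reset from an explicit key-change flag, rather than from a test such as svCount=0, avoids depending on a magic counter value.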

priyadarshikunal · Post by **priyadarshikunal** » Mon Jul 13, 2009 9:36 am

Thanks for reminding me.

I just wanted to give a straight answer, as the topic's question is quite broad and generic, and there can be many consequences depending on the data and the nature of the requirement. No one can give a perfect answer unless the requirements are crystal clear; otherwise we have to guess.

I hope the poster knows exactly what he or she wants.