Splitting Single Row into Multiple Rows Based on Columns

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
iq_etl
Premium Member
Premium Member
Posts: 105
Joined: Tue Feb 08, 2011 9:26 am

Splitting Single Row into Multiple Rows Based on Columns

Post by iq_etl »

I've browsed around the forums and haven't found a situation like this one. We have each row from an input SQL server table that needs to be split into two based on two columns.

For example, the input row looks like this (code with budgets for a year):
Code.....Year12-13.....Year13-14
303.......60000...........15000

I need to turn that into:
Code.....Year.....Budget
303.......2012.....60000
303.......2013.....15000

Has anyone solved a situation like this?

Thanks!
arunkumarmm
Participant
Posts: 246
Joined: Mon Jun 30, 2008 3:22 am
Location: New York
Contact:

Post by arunkumarmm »

If you gonna have only two years, why don't you split it in the transformer? Two links out of your transformer, one with first year and the other with another year and then join them again? Am I missing something?
Arun
iq_etl
Premium Member
Premium Member
Posts: 105
Joined: Tue Feb 08, 2011 9:26 am

Post by iq_etl »

Next year, we'll have 'Year14-15', then 'Year 15-16', etc.

I suppose that will mean annual updates to the DS job either way.

Would it be better to add a new split to the Transformer every year, or can this be done in something like the Pivot stage?
arunkumarmm
Participant
Posts: 246
Joined: Mon Jun 30, 2008 3:22 am
Location: New York
Contact:

Post by arunkumarmm »

You mean to say only the value of the years will change and the number of columns will remain the same? What is the indication you will have w.r.t the year?

I'm not sure if pivot can handle this, though the first thought comes into the mind is pivot after reading the requirement :lol:

As you are in 9.1, you can play around the loop variables and try to get it done.
Arun
iq_etl
Premium Member
Premium Member
Posts: 105
Joined: Tue Feb 08, 2011 9:26 am

Post by iq_etl »

Ha, no sorry about the confusion.

The number of columns from the source table will increase every year.
arunkumarmm
Participant
Posts: 246
Joined: Mon Jun 30, 2008 3:22 am
Location: New York
Contact:

Post by arunkumarmm »

Okay.. Sounds like a kinda dynamic pivot :D

If you not gonna be dealing with a lot of rows, write your records to a file as csv. Read the entire record as one single column. Your ground zero year is 2012(You can even hardcode this value) and add one to it for every iteration to get the year and put a field function to extract the amount from the csv string and keep adding one to it for every iteration to extract the next value.

Your loop can be set to a value of count of ',' in your string (You dont have to add one to the count if you include your CODE column in the source as well)

There might be better ways, but If I was you, I will do this.
Arun
iq_etl
Premium Member
Premium Member
Posts: 105
Joined: Tue Feb 08, 2011 9:26 am

Post by iq_etl »

Between our responses, I went with your first suggestion and have splitted them out at the transformer. Using Peek stages, the rows look correct. Should I use Join or Merge after this? Also, is it a good practice to have two links go from a Transformer straight to a Join/Merge or should there be some sort of dataset/file between those stages to hold the rows?

Also, I'm going to look at doing this with a loop as you later suggest as that will better suit long term growth. However, I'd like to get these initial two years loaded as quick as possible.
arunkumarmm
Participant
Posts: 246
Joined: Mon Jun 30, 2008 3:22 am
Location: New York
Contact:

Post by arunkumarmm »

You have the same metadata from both the links right? Use funnel.
Arun
iq_etl
Premium Member
Premium Member
Posts: 105
Joined: Tue Feb 08, 2011 9:26 am

Post by iq_etl »

Perfect! Thanks for the tips!
Post Reply