Required Output

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

balu536
Premium Member
Premium Member
Posts: 103
Joined: Tue Dec 02, 2008 5:01 am

Required Output

Post by balu536 »

Hi all,
I have a requirement which is as defined below.

i/p:

Col1 Col2 DATE Col4
100 01 2009 33
100 01 2008 24
100 01 2004 30
100 01 2002 15
200 01 2010 50
200 01 2008 21
200 01 2005 45
200 01 2003 20
200 01 2001 12
200 01 1999 10

For the above input i need to add two extra columns in such a way that the values are the next consecutive values under DATE column in source for a particular group(based on Col1).For the last records in that particular group,NULLS can be passed to the new added columns.

required o/p resembles as shown below:

Col1 Col2 DATE Col4 Col5 Col6
100 01 2009 2008 2004 33
100 01 2008 2004 2002 24
100 01 2004 2002 NULL 30
100 01 2002 NULL NULL 15
200 01 2010 2008 2005 50
200 01 2008 2005 2003 21
200 01 2005 2003 2001 45
200 01 2003 2001 1999 20
200 01 2001 1999 NULL 12
200 01 1999 NULL NULL 10


Please clarify whether this logic can be achieved with the stage variables in Transformer.If so guide me for the same.

Regards,
Balakrishna
balu536
Premium Member
Premium Member
Posts: 103
Joined: Tue Dec 02, 2008 5:01 am

Post by balu536 »

Forgot to mention.Col4 and Col5 are the newly added column.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Sort it by key descending and use stage variables.
balu536
Premium Member
Premium Member
Posts: 103
Joined: Tue Dec 02, 2008 5:01 am

Post by balu536 »

Sort operation is not required as the data in the input file is going to be in sorted order. Will you please explain the way how the stage variables can be used to achieve the logic.
priyadarshikunal
Premium Member
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Sort in reverse order then use 3 stage variables out of which two should be to hold previous two values with a constraint to flush the values in those stage variables if the key value changes .
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Sort IS required as we need the data by descending on your key(s).

Hint: You will need 4 stage variables.

Think about it and write a pseudocode. I will leave it to your thought process.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

This is a bit trickier to implement, since DataStage does not do "read-ahead" of rows, i.e. a given row cannot know what the next one is. Fortunately with stage variables you can store values from previous rows. So in this case you would collect year numbers and values in stage variables until the group change in column 1 occurs, at which time (again, using stage variables) you would build an output string of the format and contents you want. This would include your own column separators and line terminators and you would output the whole thing as one column - going to a sequential file or named pipe with no formatting. This can then be read in with the correct column definitions. The procedure isn't as complicated as the description makes it sound and it has also been explained several times in DSXChange, although mainly in the Server job section.
balu536
Premium Member
Premium Member
Posts: 103
Joined: Tue Dec 02, 2008 5:01 am

Post by balu536 »

I doubt whether the stage variables work or not.Here we need to get the value under the DATE column for the next 2 consecutive records to pass to the two new columns.Any other method is highly appreciable.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Re: Required Output

Post by Sainath.Srinivasan »

balu536 wrote: Please clarify whether this logic can be achieved with the stage variables in Transformer.
I thought that is what you wanted.

Anyhow, did you give it a try? Do you have the pseudocode of what you want ?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

I think a sort wouldn't work here, because it would change the order of the YEAR columns. A pure DataStage solution with stage variables as mentioned above is possible but complex. Another possibility is to a add a line number column and then sort by descending line number, the you could store "previous" values of years in stage variables.
balu536
Premium Member
Premium Member
Posts: 103
Joined: Tue Dec 02, 2008 5:01 am

Post by balu536 »

Sainath,
I have a solution with stage variables when sorted in ascending order

i/p:
Col1 Col2 DATE Col4
100 01 2002 15
100 01 2004 30
100 01 2008 24
100 01 2009 33
200 01 1999 10
200 01 2001 12
200 01 2003 20
200 01 2005 45
200 01 2008 21
200 01 2010 50

Stage Variable Utilisation:
-------------------------------
Expression Stage Variable
S2 S3
S1 S2
DATE S1

Column Derivation:
-------------------------------
Value Column Name
S1 Col3
S2 Col4
S3 Col5

This logic works only when the data is in Ascending order.

Can any one explain me for the descending order case?
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

That was what we were mentioning.

From me - descending => reverse order. Not descending by date.

Hope this clarifies.
balu536
Premium Member
Premium Member
Posts: 103
Joined: Tue Dec 02, 2008 5:01 am

Post by balu536 »

Hi Srinivas,
As you are premium poster,i'm unable to view your entire solution posted


Regards,
Balakrishna
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

All I was trying to convey is your solution is what I had in my mind. You may have to add 1 more stage variable.
balu536
Premium Member
Premium Member
Posts: 103
Joined: Tue Dec 02, 2008 5:01 am

Post by balu536 »

yeah one more stage variable is required inorder to perform this operation for each group.

But sainath is there any way to achieve this logic when the data is in descending order of DATE field(i.e the i/p data as i posted in my initial post).
Post Reply