Sum of n number of rows

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
anu123
Premium Member
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA

Sum of n number of rows

Post by anu123 »

Hello,

My requirement is, I get data from a table with approx 75,000 rows. Right now I need to separate them into chunks of 10 rows. I have 2 columns which must be summed for every 10 rows based on the sum I need to generate another row which makes the total sum of the previous 10 rows zero.

My approach,

I have used stage variables to divide them into chunks of rows with the help of @inrownum and @outrownum and some math functions. 1st row will be header and remaining 9 rows detail. I need to sum up all the rows until I get the next header.


After that I have used stage variables to calculate the sum of 10 rows, added new 2 columns to have their values present. Next step i have used used remove duplicates to remove all the records and retained the last record, so that I can funnel that record with the 10 records and makes it a chunk of 11 records.
e.g.;
1 row - header
2 row - detail

9 row -detail
10 row - header

transforms to
1 row - header
2 row - detail

9 row - detail
10 row - new offset records (which has 2 columns with values that will make the sum of previous nine rows + the new offset column value = 0 )
10 row - header

How can I do this in transformer itself instead of using remove duplicates adding additional columns for summation and making 2 copies of data
removing duplicates retaining the last value and funneling it out to the sequential file?

I am aligning it right now with a unique number for every 10 set of rows so that the chunk of rows remain.
Thank you,
Anu
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Might be a good candidate for using Transformer stage looping. Count from 1 to 10, initialize or increment the loop variable that's keeping the ongoing sum.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
anu123
Premium Member
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA

Post by anu123 »

I am able to keep the ongoing sum. But after summation, I need to introduce a new row at 11th position which makes the total of 11 rows = 0

e.g.; consider sum of 10 rows to be 100, the value in the 11th row should be -100. And if the sum of next 10 rows from 12th position to 21st postion is -50 the 22nd position should have a value of +50
Thank you,
Anu
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

There's no explicit mention of an eleventh row in your original question, which is probably why it was not addressed in my answer.

However, you could create a second output link that outputs only the negative of the total when the rowcount is 10. Use any appropriate stage type downstream to combine the two streams.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
anu123
Premium Member
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA

Post by anu123 »

Thanks for the reply
Mod(@INROWNUM,10) = StageVar --
If StageVar = 0 Or @INROWNUM =1 Then 'H' Else 'D'
Above is my current logic for assigning header and detail for every 10 records, it works for all records but not in the case of first 10 records
I need to uniformly divide them
right now my headers position are at 1,10,20,30 .. so on

I am using below logic to assign unique number for every 10 rows so that I can use them for future summation as we talked above
@INROWNUM/10

Can you suggest?
Thank you,
Anu
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Code: Select all

Mod(@INROWNUM,10) = StageVar -- 
Header/Detail: If StageVar = 0 Or @INROWNUM =1 Then 'H' Else 'D' 
GroupNumber: If StageVar = 0 Then GroupNumber+1 Else GroupNumber
Initial Value of GroupNumber = 1
-craig

"You can never have too many knives" -- Logan Nine Fingers
anu123
Premium Member
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA

Post by anu123 »

Thanks for the reply.
Thank you,
Anu
anu123
Premium Member
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA

Post by anu123 »

Hello,

My requirement is right now I am assigning a unique number for every 500 numbers with the logic @INROWNUM/500.

But the requirement changed so that, I could accomodate this with the change of date or for every 500 numbers.

If there are 100 rows with 100 different posting dates I need to create 100 headers.

If I have 1250 rows with two different posting dates, for example one posting date has 650 records and the second posting date has 600 records

It needs to be divided like,

The first posting date that has 650 records will be divided into 500 records of 1 batach and 150 records of another batch


The next sposting date should be divided in the same fashion in chunck of 500 and 100. Please suggest
Thank you,
Anu
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Everything that's already been suggested still applies, it just needs to be... adjusted to fit. You just have two things to check now rather than one which doesn't really complicate it all that much more.

Meaning... your change is not a fundamental "gotta go back to the drawing board" kind of change, more of a minor variation on the existing theme.
-craig

"You can never have too many knives" -- Logan Nine Fingers
anu123
Premium Member
Premium Member
Posts: 143
Joined: Sun Feb 05, 2006 1:05 pm
Location: Columbus, OH, USA

Post by anu123 »

Thanks,

I was able to batch them, based on both the conditions
Thank you,
Anu
Post Reply