How to calculate running totals (cumulative) based on a col?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

OK. With that out of the picture we'll need details of how you are sorting to try and see what the issue is. For example, when using the Sort stage when settings did you use?
-craig

"You can never have too many knives" -- Logan Nine Fingers
bond88
Participant
Posts: 109
Joined: Mon Oct 15, 2012 10:05 am
Location: USA

Post by bond88 »

I used "order by" clause in source query.

Select ..........
from ......
where ....
group by project_id, accounting_period, category, category_name
order by project_id, accounting_period, category, category_name

In sort stage

I am defining each one of the Project_id, accounting_period, category and category_name as KEY,
sort key mode = sort
sort order = ascending

and default options
allow duplicates = true
create cluster key change column = false
create key change column = false
output statistics = false
sort utility = datastage
stable sort = true

Please let me know if I missed anything.

Thanks,
Bhanu
jerome_rajan
Premium Member
Premium Member
Posts: 376
Joined: Sat Jan 07, 2012 12:25 pm
Location: Piscataway

Post by jerome_rajan »

Once again, why not try doing it all in the query?

Code: Select all

SELECT SUM(AMT) OVER (PARTITION BY ID ORDER BY CODE) FROM TABNAME
ought to give you the expected result in fairly quick time. OLAP functions provided by database vendors are very efficient and if you have the choice, go ahead and use them. We had a similar situation and the OLAP based query returned results almost 3 times as fast as DataStage did.
Jerome
Data Integration Consultant at AWS
Connect With Me On LinkedIn

Life is really simple, but we insist on making it complicated.
bond88
Participant
Posts: 109
Joined: Mon Oct 15, 2012 10:05 am
Location: USA

Post by bond88 »

I am getting running totals continuously. I want totals separately for each project. Can anyone suggest how can I achieve it?

Thank you,
Bhanu
bond88
Participant
Posts: 109
Joined: Mon Oct 15, 2012 10:05 am
Location: USA

Post by bond88 »

I am getting running totals continuously. I want totals separately for each project. Can anyone suggest how can I achieve it?

In detail

Project_id----year----account_period----category----amount

1 2013 1 10 150
1 2013 1 20 280
1 2013 2 10 200
1 2013 2 20 450
2 2013 1 10 100
2 2013 1 20 520
2 2013 2 10 120
2 2013 2 20 80

I am getting output like this

Project_id----year----account_period----category----amount----cumulative

1 2013 1 10 150 150
1 2013 1 20 280 280
1 2013 2 10 200 350 (200+150)
1 2013 2 20 450 730 (450+280)
2 2013 1 10 100 450 (100+350)
2 2013 1 20 520 1250 (520+730)
2 2013 2 10 120 570 (120+450)
2 2013 2 20 80 1330 (80+1250)

But I need output like this

Project_id----year----account_period----category----amount----cumulative

1 2013 1 10 150 150
1 2013 1 20 280 280
1 2013 2 10 200 350 (200+150)
1 2013 2 20 450 730 (450+280)
2 2013 1 10 100 100
2 2013 1 20 520 520
2 2013 2 10 120 220 (120+100)
2 2013 2 20 80 600 (80+520)


Thank you,
Bhanu
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Sort by project_id, year, account_period and category in a Sort stage, generating a key change column.

In Transformer stage derive stage variable svRunningTotal value as follows.

Code: Select all

If keyChange = 1 Then amount Else svRunningTotal + amount
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sonal
Premium Member
Premium Member
Posts: 8
Joined: Tue Jun 07, 2005 1:10 am
Location: India

Post by sonal »

You could use the loop function in transformer, but you need to sort the data in datastage and partition it according to the key values.The order by caluse of database , need not result in ordered data within each partition
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

As noted, you're just not sorting it properly to match your stated requirements.
-craig

"You can never have too many knives" -- Logan Nine Fingers
bond88
Participant
Posts: 109
Joined: Mon Oct 15, 2012 10:05 am
Location: USA

Post by bond88 »

Thanks Chulett,
It was the problem with transformer stage option "sequential". I changed everywhere expect in transformer thats why I was getting errors. Now the thing is I am getting running totals for each code seperately but continously for all projects, but I want separate running totals for each project. Ray suggested something to use key change column from sort stage but I want to get running totals separately for each project and as well as category. Could you suggest me a way to achieve it ?

Thank you,
Bhanu
bond88
Participant
Posts: 109
Joined: Mon Oct 15, 2012 10:05 am
Location: USA

Post by bond88 »

Thanks Ray,
I need to populate running totals for each category (like 10, 20, 30, 40......) and for each project (project_id's). Already I am using below logic to get running totals but I am not sure how could I add key column change to this. Could you please give a suggestion?

Logic I am using in transformer stage:

TEMP1 ---- IF DSLink21.CATEGORY='10' THEN TEMP1+DSLink21.BUDGET_AMOUNT ELSE TEMP1

TEMP2 ---- IF DSLink21.CATEGORY='20' THEN TEMP2+DSLink21.BUDGET_AMOUNT ELSE TEMP2

TEMP3 ---- IF DSLink21.CATEGORY='30' THEN TEMP3+DSLink21.BUDGET_AMOUNT ELSE TEMP3



TEMP ---- if (DSLink21.CATEGORY='10')
Then TEMP1
Else If (DSLink21.CATEGORY='20')
Then TEMP2
Else TEMP3


Assigning this TEMP to output link.

How could I use another IF here (key change column)?

Thank you,
Bhanu
bond88
Participant
Posts: 109
Joined: Mon Oct 15, 2012 10:05 am
Location: USA

Post by bond88 »

Thanks Ray,
By generating key column change column in sort stage could achieve this task.
Bhanu
Post Reply