Rollup Data

samit_9999 · Post by **samit_9999** » Fri Jul 14, 2006 2:53 pm

Hi ,

I have a Table with such kind of info

Region Dept Emp Sal
South 10 Sam 10
South 10 Tam 20
South 20 Pam 40
North 30 Cam 50
North 40 Mam 100
North 40 Dam 50

I need information Rolled up at both Region and Dept Level
Total Salary for Region South is 70 Dept 10 = 30 Dept 20=40
Total Salary for Region North is 200 Dept 30 = 50 Dept 40 = 150

I need to generate a file through DataStage job with the following output

Region Dept Dep_Sal_Total Region_Sal_Total
South 10 30 70
South 20 40 70
North 30 50 200
North 40 150 200

Backend is TeraData.

Please help!!

Krazykoolrohit · Post by **Krazykoolrohit** » Fri Jul 14, 2006 3:24 pm

I can suggest you two solutions:

1. write a SQL query in the ODBC stage aggregating your columns and sending the output to the new columns. something like:

select Region,Dept,Dep_Sal_Total,Region_Sal_Total

from <table>

where

Dep_Sal_Total in (select dept,Sum(Sal) from <table> group by dept)
and Region_Sal_Total in (select region,Sum(Sal) from <table> group by sal)

group by Dep_Sal_Total,Region_Sal_Total

Add Region,Dept,Dep_Sal_Total,Region_Sal_Total in the columns definition.

Use the query as a hint. I am unable to test and run the query for you right now.

2. use an aggregator stage if you want to sum on individual columns. use a transformer in the end to organize the output and drop the rows which you dont want.

For more inputs of aggregator reer to the documentation available with installation

DSguru2B · Post by **DSguru2B** » Fri Jul 14, 2006 3:25 pm

do it in the sql itself

Code: Select all

select A.Region, A.Dept, Sum(A.Sal) As Dep_Sal_Total, Temp.Sal
from table A
inner join (select B.Region,sum(B.Sal)As Sal from table B group by Region) as Temp
on A.Region = Temp.Region
group by A.Region, A.Dept, Temp.Sal

table A, table B both are the same table.

DSguru2B · Post by **DSguru2B** » Fri Jul 14, 2006 3:28 pm

Ok what just happened to my post

Krazykoolrohit · Post by **Krazykoolrohit** » Fri Jul 14, 2006 3:32 pm

SOS roy.

chulett · Post by **chulett** » Fri Jul 14, 2006 3:43 pm

How did you guys manage this? Now we're all stuck inside Krazykoolrohit's world... help!

DSguru2B · Post by **DSguru2B** » Fri Jul 14, 2006 6:24 pm

I know, tell me about it.
Roy...Need you. Roy

ray.wurlod · Post by **ray.wurlod** » Sat Jul 15, 2006 4:38 am

Walter, more likely, our dear webmaster, is the one to fix this. Looks like the message board software has gone somewhat agley.

samit_9999 · Post by **samit_9999** » Mon Jul 17, 2006 8:30 am

Thank you DSGuru.

However the problem is much bigger.

I am selecting around 60 columns from some 4-5 Dimensional Views and i am almost at the rim of hitting the 64 Join Parser Limit.So i do not really wanna join to the same view once again .

Is there any option through DataStage that i can achieve the same results?

Krazykoolrohit · Post by **Krazykoolrohit** » Mon Jul 17, 2006 8:45 am

why dont you do all your joins at the database side and populate a final view. use datastage to extract data from there. would be much faster

DSguru2B · Post by **DSguru2B** » Mon Jul 17, 2006 8:52 am

Offcourse you can. Break the query into two.
Query1:

Code: Select all

select B.Region,sum(B.Sal)As Sal from table B group by Region

Since this query is at the Region level, it will retrieve lesser amount of rows as compared to the second query, load this into hashed file, keyed on Region.

Query2:

Code: Select all

select A.Region, A.Dept, Sum(A.Sal) As Dep_Sal_Total, Temp.Sal 
from table A 
group by A.Region, A.Dept, Temp.Sal

Query 2 will be your source. Do a lookup to the hashed file.
A NOT(LKUPLINK.NOTFOUND) is equivalent to an inner join

But then again, if datasize is massive, go for KrazyKoolRohit's suggestion.

samit_9999 · Post by **samit_9999** » Mon Jul 17, 2006 9:45 am

Thanks to all of u