Counts and sums in DataStage without the 'Aggregator' stage

anu123 · Post by **anu123** » Fri Jan 19, 2007 2:51 pm

'Auto', the default one.

DSguru2B wrote: In the Transformer, go to the stage properties. What partitioning are you providing?

DSguru2B · Post by **DSguru2B** » Fri Jan 19, 2007 2:54 pm

And that's where you are going wrong. You need to partition by the keys. Set the partitioning to Hash and choose the three keys, then run your job again and see what happens.
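To see why key-based partitioning matters for per-group counts, here is a minimal Python simulation, not DataStage code. The sample rows, the two-node setup, and the use of round robin as a stand-in for a key-agnostic scheme are all assumptions made for illustration; the thread does not say what 'Auto' actually chose.

```python
from collections import Counter
from zlib import crc32

# Hypothetical sample rows: each tuple is the three key columns only.
rows = [
    ("A", "x", 1), ("B", "y", 2), ("A", "x", 1),
    ("A", "x", 1), ("B", "y", 2), ("C", "z", 3),
]
NODES = 2  # assumed number of processing nodes


def round_robin(rows, nodes):
    """Deal rows out in turn, ignoring their key values."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts


def hash_by_keys(rows, nodes):
    """Send every row with the same key values to the same partition."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        parts[crc32(repr(row).encode()) % nodes].append(row)
    return parts


def counts_per_partition(parts):
    """Count each key group independently inside each partition."""
    return [Counter(part) for part in parts]


rr_counts = counts_per_partition(round_robin(rows, NODES))
hash_counts = counts_per_partition(hash_by_keys(rows, NODES))

print("round robin:", rr_counts)
print("hash by keys:", hash_counts)
```

With round robin, the three ("A", "x", 1) rows are split across both partitions, so no partition ever sees the full group and each reports a partial count. With hashing on the keys, every group lands whole on one partition.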

DSguru2B · Post by **DSguru2B** » Fri Jan 19, 2007 3:01 pm

Coming back to why you are not using the Aggregator stage: inconsistent and inaccurate data is not the fault of the Aggregator stage. It's the fault of how it was used. Take a small set of data and build a job around the Aggregator stage. Test it out. If you have problems, we are here to help you out. Once you get that working, feed in a couple of million rows. If that works too, feed in your complete data feed. Prove to "them" that the Aggregator works just fine, instead of reinventing the wheel. You do realize that doing all those current-row-to-previous-row comparisons will also take some time? It won't be drastically slow, but still.
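For reference, the current-row-to-previous-row comparison mentioned above can be sketched in Python as follows. This is only an illustration of the idea, assuming the input is already sorted by the keys; the function name and sample rows are hypothetical and the actual Transformer derivations are not shown in the thread.

```python
def running_counts(sorted_rows):
    """Yield (key, running_count) for rows already sorted by key."""
    prev_key = None
    count = 0
    for key in sorted_rows:
        # Keep counting while the key repeats; restart at 1 when it changes.
        count = count + 1 if key == prev_key else 1
        prev_key = key
        yield key, count


# Hypothetical sample: each tuple is the three key columns.
rows = sorted([
    ("A", "x", 1), ("B", "y", 2), ("A", "x", 1),
    ("A", "x", 1), ("B", "y", 2),
])
result = list(running_counts(rows))
for key, count in result:
    print(key, count)
```

Every input row produces an output row, and only the last row of each key group carries the final count. That is why a further step is needed to keep just that last row per group.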

anu123 · Post by **anu123** » Fri Jan 19, 2007 3:08 pm

I got it. Thanks, Guru.

One last question: is there any way to write only the last record, with its count, to the output, instead of having Sort and Remove Duplicates stages after the Transformer?

DSguru2B wrote: And that's where you are going wrong. You need to partition by the keys. Set the partitioning to Hash and choose the three keys, then run your job again and see what happens.

DSguru2B · Post by **DSguru2B** » Fri Jan 19, 2007 3:18 pm

Not that I can think of right now. Maybe someone else knows and will shed some light on it.

anu123 · Post by **anu123** » Fri Jan 19, 2007 3:25 pm

Anyway, thank you, Guru and all.

DSguru2B wrote: Not that I can think of right now. Maybe someone else knows and will shed some light on it.

ray.wurlod · Post by **ray.wurlod** » Fri Jan 19, 2007 4:05 pm

Use a Tail stage set to 1 row and executing on one node only (perhaps in sequential mode).
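As a rough Python analogue of that suggestion, assuming a Tail stage keeps the last N rows of each partition it receives: the sketch below contrasts taking the tail per partition with taking it from a single sequential stream. The helper name and the sample partitions are hypothetical.

```python
from collections import deque


def tail(rows, n=1):
    """Return the last n rows of an iterable, like a tail over one partition."""
    return list(deque(rows, maxlen=n))


# Hypothetical (key, running_count) rows spread over two partitions.
partitions = [
    [("A", 1), ("A", 2)],
    [("B", 1), ("B", 2), ("B", 3)],
]

# Tail applied separately on each partition: one row per partition.
per_partition = [tail(p) for p in partitions]

# Tail applied to a single sequential stream: one row in total.
single_stream = tail([row for p in partitions for row in p])

print(per_partition)
print(single_stream)
```

On one node the result is a single row for the whole stream, which is the behaviour the "one node only" condition is there to guarantee.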