Count of Records

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

devidotcom
Participant
Posts: 247
Joined: Thu Apr 27, 2006 6:38 am
Location: Hyderabad

Post by devidotcom »

Hi,

I need the count of records in a parallel job. I split the data in a Transformer stage using constraints and send each output to an Aggregator stage to count the records.

A quick question: since this is a parallel DS job, do I need to sort the data on the key column before sending it to the target, or would the job count correctly without a Sort stage?

Thanks in advance
us1aslam1us
Charter Member
Posts: 822
Joined: Sat Sep 17, 2005 5:25 pm
Location: USA

Post by us1aslam1us »

It is always good practice to sort the data before performing any aggregation. In your case, if you just want to count the records processed or filtered, I suggest looking at DSGetLinkInfo().
I haven't failed, I've found 10,000 ways that don't work.
Thomas Alva Edison(1847-1931)
devidotcom
Participant
Posts: 247
Joined: Thu Apr 27, 2006 6:38 am
Location: Hyderabad

Post by devidotcom »

Thanks for the reply.
But I do not rely on the link count, because it behaves oddly at times and gives a different count, and I don't know why.
siddesai
Participant
Posts: 26
Joined: Thu Apr 26, 2007 11:28 pm

Post by siddesai »

devidotcom wrote: Thanks for the reply.
But I do not rely on the link count, because it behaves oddly at times and gives a different count, and I don't know why.
Does anyone know why the link gives weird numbers? Is that a bug?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

We have only your assertion that the numbers are "weird". What do you mean by this term? Where's the proof?

Have you tried setting the environment variable that causes player processes to report their row counts?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
abc123
Premium Member
Posts: 605
Joined: Fri Aug 25, 2006 8:24 am

Post by abc123 »

To get a count, you don't need to sort. It doesn't matter how many nodes you have; an Aggregator will do. It is probably the simplest of the Aggregator functions.
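
To illustrate why a plain total does not depend on sort order or node count, here is a minimal Python sketch. It is not DataStage code: the `round_robin` and `total_count` helpers and the sample rows are hypothetical, and it assumes the per-partition counts are summed in one final step.

```python
import random


def round_robin(rows, nodes):
    """Deal rows out to `nodes` partitions in turn (hypothetical helper)."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts


def total_count(rows, nodes):
    """Count each partition independently, then sum the partial counts."""
    partial_counts = [len(part) for part in round_robin(rows, nodes)]
    return sum(partial_counts)


# Sample data in deliberately random (unsorted) order.
rows = list(range(1000))
random.shuffle(rows)

for nodes in (1, 2, 4, 7):
    print(nodes, "node(s):", total_count(rows, nodes))
```

The total is the same for every node count because counting only adds up partition sizes; row order never enters into it.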
Maveric
Participant
Posts: 388
Joined: Tue Mar 13, 2007 1:28 am

Post by Maveric »

If you want the total record count, then you don't have to sort. If you want the count based on a key field, then hash partition and sort on the key.
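
A rough Python sketch of why the partitioning matters for counts by key. This only simulates the idea and is not DataStage code: the helper names, the `region` column, and the use of `zlib.crc32` as the hash are all assumptions made for illustration.

```python
import zlib
from collections import Counter


def hash_partition(rows, nodes, key):
    """Send every row with the same key value to the same partition."""
    parts = [[] for _ in range(nodes)]
    for row in rows:
        idx = zlib.crc32(str(row[key]).encode()) % nodes
        parts[idx].append(row)
    return parts


def round_robin_partition(rows, nodes):
    """Deal rows out in turn, ignoring the key."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts


def count_by_key(parts, key):
    """Aggregate each partition on its own, as a parallel stage would."""
    return [Counter(row[key] for row in part) for part in parts]


rows = [{"region": r} for r in
        ["east", "west", "east", "north", "west", "east"]]

hashed = count_by_key(hash_partition(rows, 2, "region"), "region")
dealt = count_by_key(round_robin_partition(rows, 2), "region")

# With hash partitioning each key shows up in exactly one partition,
# so every per-partition count is already final.
print("hash partitioned:", hashed)

# With round robin the same key can land in several partitions,
# giving several partial counts that still need to be merged.
print("round robin:     ", dealt)
```

In this simulation the hash-partitioned run yields one count per key, while the round-robin run yields partial counts for the same key on more than one partition.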