Sorting Issues...

Moderators: chulett, rschirm, roy

rasi
Participant
Posts: 464
Joined: Fri Oct 25, 2002 1:33 am
Location: Australia, Sydney

Post by rasi »

Hi,

My job creates 12 million records, which need to be aggregated and then inserted into a table. I pre-sort the data and set the sort order for the sorted columns in the Aggregator stage. But when I run it for 12 million records it aborts, saying "row out of sequence", whereas for a few hundred thousand records it works fine.

Can anyone help?
Rasi
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Your sorted data does not match the information you put into the Aggregator stage. Either you set up the Aggregator stage incorrectly, or your sorting is not correct.

Kenneth Bland
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

I second Ken's diagnosis. Can you post your sorting criteria and the contents of the Inputs grid of the Aggregator stage?

Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
rasi
Participant
Posts: 464
Joined: Fri Oct 25, 2002 1:33 am
Location: Australia, Sydney

Post by rasi »

Hi Ray,

I have 11 columns in total in the input grid and 11 columns in the output grid. Of the 11 columns, 10 are pre-sorted before the Aggregator stage. All 10 sorted columns have Group enabled, and the one remaining column is summed for each group.

I checked the input order and the output order, and both are the same. One more thing: the job ran when I removed the sort order from the Aggregator stage. But it should still work with the sort order enabled.

Cheers
Rasi
degraciavg
Premium Member
Posts: 39
Joined: Tue May 20, 2003 3:36 am
Location: Singapore

Post by degraciavg »

quote:Originally posted by rasi

Of the 11 columns, 10 are pre-sorted before the Aggregator stage. All 10 sorted columns have Group enabled, and the one remaining column is summed for each group.



How was the pre-sorting done? Was the data sorted at the source or by a Sort stage?

The key is to make sure the column sequence in the Order By clause or Sort stage is the same as in the Aggregator stage. Also make sure that the sort order (ascending or descending) of each field is the same.
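
One way to test both conditions outside DataStage is to scan the sorted data and report the first row that breaks the declared order. This is a minimal sketch, not anything DataStage provides: the function name `first_out_of_sequence` and the `(column index, ascending)` spec are hypothetical, and values are compared as plain strings, which may not match the comparison the Aggregator stage actually uses.

```python
from typing import Iterable, Optional, Sequence, Tuple

# Hypothetical sort spec: one (column index, ascending?) pair per sort column,
# listed in the same sequence as the sort columns in the Aggregator stage.
SortSpec = Sequence[Tuple[int, bool]]


def first_out_of_sequence(rows: Iterable[Sequence[str]], spec: SortSpec) -> Optional[int]:
    """Return the 0-based index of the first row that breaks the order, or None."""
    prev = None
    for i, row in enumerate(rows):
        if prev is not None:
            for col, ascending in spec:
                a, b = prev[col], row[col]
                if a == b:
                    continue  # tie on this column, so compare the next one
                in_order = a < b if ascending else a > b
                if not in_order:
                    return i
                break  # strictly ordered here; later columns no longer matter
        prev = row
    return None
```

Running it over the real file with the columns in a different sequence, or with one direction flipped, shows quickly whether the sort and the stage settings disagree.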

If you have done a thorough check on the job and the problem still persists, you will have to check your resources, especially swap disk space.

If you have limited resources, it might be wiser to partition your data and do the aggregation for each partition. What is your DS version?

regards,
vladimir
inter5566
Premium Member
Posts: 57
Joined: Tue Jun 10, 2003 1:51 pm
Location: US - Midwest

Post by inter5566 »

Rasi,

Vladimir pretty well covered the answer. But put in shorter terms, the sort column in the aggregator stage is only for indicating that the incoming data is already sorted.

Steve
rasi
Participant
Posts: 464
Joined: Fri Oct 25, 2002 1:33 am
Location: Australia, Sydney

Post by rasi »

Hi Vladimir,

Pre-sorting was done in Unix. The DataStage version is 6. As I mentioned, the column sequence and the sort order are correct in the Aggregator stage input and output. I also checked the resources, and they are fine. The thing is, if I remove the sort order the job runs fine and I get the result, and that should take more resources than running with the sort order.

Cheers
Rasi
degraciavg
Premium Member
Posts: 39
Joined: Tue May 20, 2003 3:36 am
Location: Singapore

Post by degraciavg »

Hi Rasi,

The Aggregator stage performs better when the input data is sorted. It doesn't consume more resources than when the input data is not sorted. If you don't get the error when you remove the sort order, then you don't have a resource problem. The "out of sequence" error in this case means that your input data is definitely not sorted.
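
The reasoning can be illustrated with a small sketch. It is not DataStage's actual implementation, only an assumed model of the two modes: with sorted input an aggregator needs to hold just one group at a time and can detect a key that goes backwards, while without the sort assumption it must keep every group in memory. The function names and the `(key, value)` row shape are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple

Row = Tuple[str, int]  # hypothetical row shape: (group key, value to sum)


def aggregate_sorted(rows: Iterable[Row]) -> Iterator[Row]:
    """Sum values per key, assuming rows arrive in ascending key order."""
    current_key = None
    total = 0
    for key, value in rows:
        if current_key is None:
            current_key = key
        elif key != current_key:
            if key < current_key:
                # The declared sort order is violated, so stop instead of
                # emitting the same group twice.
                raise ValueError(f"row out of sequence: {key!r} after {current_key!r}")
            yield current_key, total  # group is complete, release it
            current_key, total = key, 0
        total += value
    if current_key is not None:
        yield current_key, total


def aggregate_unsorted(rows: Iterable[Row]) -> Dict[str, int]:
    """Sum values per key with no ordering assumption; holds all groups in memory."""
    totals: Dict[str, int] = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals
```

Under this model, a job that fails in the sorted mode but succeeds in the unsorted mode points at the ordering of the data rather than at memory or disk.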

Do you do any lookup before you aggregate? Is your lookup data sorted?

You may try this experiment...
1. Create another job that sorts your input data and stages it into a new sequential file (use the Sort stage).
2. Then use this sequential file as the input of your Aggregator stage (use the same sort order).
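
For a quick check outside DataStage, step 1 can be approximated with a small script. This is only a sketch under stated assumptions: the file is comma-delimited, every key column sorts ascending as plain text, and the whole file fits in memory, which may not hold for 12 million rows. The name `stage_sorted` and its parameters are hypothetical.

```python
import csv
from typing import Sequence


def stage_sorted(src_path: str, dst_path: str, key_cols: Sequence[int]) -> int:
    """Sort a delimited file on key_cols (ascending, string compare) and stage it.

    Returns the number of rows written.
    """
    with open(src_path, newline="") as src:
        rows = list(csv.reader(src))
    # Build the sort key from the key columns in the given sequence.
    rows.sort(key=lambda r: tuple(r[c] for c in key_cols))
    with open(dst_path, "w", newline="") as dst:
        csv.writer(dst).writerows(rows)
    return len(rows)
```

If the job succeeds on a file staged this way but fails on the Unix-sorted file, the two sorts are ordering the data differently.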

Let us know the results...

Regards,
vladimir