Aggregator Stage

ak77
Charter Member
Posts: 70
Joined: Thu Jun 23, 2005 5:47 pm
Location: Oklahoma

Aggregator Stage

Post by ak77 »

Hello again,

I started this issue in a previous post, but the problem is a little different from where I started,
so I am posting a new one.
I used the -n option in the sort and it worked,
but that was for a small dataset.

When I ran it for a bigger dataset, I got this error:

ds_ipcgetnext - timeout waiting for mutex
My question:

I am sorting the data before it enters the Aggregator stage.
If I don't sort again on the input of the Aggregator stage, I think I get this error.
I also found that it takes a long time to get output from the Aggregator stage.

If I sort again in the Aggregator stage, then the output seems to flow a little faster.

Can somebody explain this to me?

Thanks
Kishan
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Re: Aggregator Stage

Post by chulett »

ak77 wrote: If I sort again in the aggregator stage, then the output seems to flow little faster
This is a common misunderstanding. You are not "sorting again" - you are asserting the already sorted order of the incoming data. If the stage has no clue about the order of the incoming data, it will take the time to 'sort' it again to support the aggregation being performed. If you've done the presorting and tell the stage you've done so, it will flow data through on each group change and not wait until it has everything before it starts producing output.

Of course, the sort must be appropriate for the grouping being performed and if you lie to the stage it *will* bust you. :wink:
-craig
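
To illustrate the point about presorted input, here is a minimal sketch in shell and awk. It is not how DataStage implements the Aggregator stage; it only shows the general idea of emitting a group as soon as the key changes, and of failing when the input turns out not to be in the asserted order. The function name `sum_sorted_groups`, the two-field layout (key in field 1, value in field 2), and the plain string comparison of keys are all assumptions made for the example.

```shell
# Hypothetical helper: sum field 2 per key in field 1 of pipe-delimited input
# that is ASSUMED to be sorted on field 1. Each group is printed as soon as
# the key changes, so nothing has to be held until end of input.
sum_sorted_groups() {
  awk -F'|' '
    NR > 1 && ($1 "") != (key "") {
      # Key went backwards: the input was not sorted as asserted.
      if (($1 "") < (key "")) {
        print "Row out of sequence at row " NR > "/dev/stderr"
        bad = 1
        exit 1
      }
      print key "|" total   # group change: flush the finished group
      total = 0
    }
    { key = $1; total += $2 }
    END { if (!bad && NR > 0) print key "|" total }   # flush the last group
  '
}

# Usage: sorted input streams through; unsorted input is rejected.
printf 'a|1\na|2\nb|5\n' | sum_sorted_groups
```

If the input is not sorted on the key, the sketch stops at the first row whose key goes backwards, which is loosely analogous to the "Row out of sequence" behaviour described in this thread.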

"You can never have too many knives" -- Logan Nine Fingers
ak77
Charter Member
Posts: 70
Joined: Thu Jun 23, 2005 5:47 pm
Location: Oklahoma

Post by ak77 »

Thanks Chulett,

So this means I should give the sort order on the input of the Aggregator stage again?

If I don't give that order, it will go ahead and sort it again?

Thanks

Kishan
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

That's correct.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ak77
Charter Member
Posts: 70
Joined: Thu Jun 23, 2005 5:47 pm
Location: Oklahoma

Post by ak77 »

Hi everybody,

Please help me with this.

When I use this command in the before-job subroutine:

Executed command: sort -t'|' -u -T tmpdirectory -k1,1 -k2,2 -k3,3 -k4,4 -k5,5 file1 file2 file3 file3 > outfile
I get the error message

Row out of sequence
At row 13990, link "input"
Row out of sequence
But when I use this command

sort -n -t'|' -u -T tmpdirectory -k1,1 -k2,2 -k3,3 -k4,4 -k5,5  -k1,1 -k2,2 -k3,3 -k4,4 -k5,5 file1 file2 file3 file3 > outfile

I get the error message

 Row out of sequence
At row 2, link "input"
Row out of sequence
I also found that the data was not sorted. The fields are:

field 1 -- char(2)
field 2 -- varchar2(14)
field 3 -- char(2)
field 4 -- number(7)
field 5 -- number(10)
field 6 -- char(3)

Thanks
Kishan
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

A UNIX sort is a string sort by default. Look on the man page for the use of the -n option for the numeric keys.
-craig
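
One way to act on this, sketched under assumptions: in POSIX sort a global `-n` applies to every `-k` key that has no modifier of its own, so the char and varchar2 keys would be compared as numbers as well, and combined with `-u` that can drop rows whose keys all look equal numerically. A per-key `n` modifier limits the numeric comparison to the number(...) fields. The function names, the sample rows, and the use of `LC_ALL=C` below are assumptions for illustration; whether the Aggregator stage expects numeric or string order for a given column depends on how the column is defined in the job.

```shell
# String comparison on every key (the default behaviour).
sort_string() { LC_ALL=C sort -t'|' -k1,1 -k2,2 -k3,3 -k4,4 -k5,5 "$@"; }

# Numeric comparison only on fields 4 and 5, the number(...) columns
# from the post; fields 1 to 3 are still compared as strings.
sort_typed() { LC_ALL=C sort -t'|' -k1,1 -k2,2 -k3,3 -k4,4n -k5,5n "$@"; }

# Hypothetical rows that differ only in field 4.
sample() {
  printf '%s\n' 'AA|x|01|100|5|abc' 'AA|x|01|20|5|abc' 'AA|x|01|3|5|abc'
}

sample | sort_string   # field 4 comes out as 100, 20, 3
sample | sort_typed    # field 4 comes out as 3, 20, 100
```

The `-u` and `-T` options from the original command are left out here to keep the comparison itself visible; they can be added back to either function.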

"You can never have too many knives" -- Logan Nine Fingers