DSXchange

Posted: **Fri Aug 28, 2009 10:18 am**

Hi All,

There is a job that currently runs on 1 node. I made a few changes to it, ran it on 1 node again, and it works fine. Now I wanted to check the performance when running on 4 nodes.
The job ran successfully, but when I checked the dataset (which is what I load at the end of my job), I see that the data is flowing through only 1 node and the other 3 nodes have 0 records.

Any light on why this could be happening?

Any help is highly appreciated.

Thanks
Shalini

Posted: **Fri Aug 28, 2009 12:50 pm**

What kind of partitioning are you doing in the job?

Posted: **Fri Aug 28, 2009 12:52 pm**

I have left the partitioning set to Auto.

Posted: **Fri Aug 28, 2009 12:55 pm**

OK, how about some details of your job design? Anything running in Sequential rather than Parallel mode?

Posted: **Fri Aug 28, 2009 1:01 pm**

Is the source a sequential file, or is the volume very small?

Posted: **Fri Aug 28, 2009 1:14 pm**

The job is doing several joins (all the joins are necessary), and the data is sorted within the database stages themselves using the ORDER BY clause. I have left the partitioning on Auto, since there were several joins and I thought specifying the partitioning and sorting explicitly would probably degrade the performance. Finally, a dataset gets loaded.

Join Stage1---5 DBStages---output1
output1 joined with DBStage ---output2
....
...
...

Similarly, after 6 such joins, the data is passed through the transformer and then loaded into the dataset.

I hope I have explained my design fairly clearly.

Thanks in advance

Posted: **Fri Aug 28, 2009 1:57 pm**

If your volume is small, only one node will be used. Increasing the number of nodes doesn't necessarily mean better throughput. It may take longer for the job to set up with 4 nodes. Add both the startup time and the CPU time for each configuration to determine which is the optimal setting. I have found that one node is more efficient most of the time.
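
To make the comparison concrete, here is a rough Python sketch of adding startup and CPU time for each node count and picking the smaller total. The timings are made-up placeholders, not measurements from this job, and the names `total_elapsed` and `runs` are purely illustrative; substitute the figures from your own job logs.

```python
def total_elapsed(startup_s, cpu_s):
    """Total cost of one run: job startup time plus CPU time, in seconds."""
    return startup_s + cpu_s


# Hypothetical timings (seconds) keyed by node count; replace with real log values.
runs = {
    1: {"startup_s": 4.0, "cpu_s": 20.0},
    4: {"startup_s": 15.0, "cpu_s": 12.0},
}

# Sum startup and CPU time for each configuration.
totals = {nodes: total_elapsed(**timing) for nodes, timing in runs.items()}

# The configuration with the smallest total is the better setting for this job.
best = min(totals, key=totals.get)

for nodes, seconds in sorted(totals.items()):
    print(f"{nodes} node(s): {seconds:.1f} s total")
print(f"Lowest total: {best} node(s)")
```

With these placeholder numbers the 4-node run has less CPU time but a longer startup, so the 1-node run comes out ahead overall; with a larger volume the balance could easily tip the other way.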

Posted: **Tue Sep 01, 2009 1:39 am**

How many records are being processed through this job (including on each side of each join)?

Add environment variable $APT_DUMP_SCORE with value True and rerun the job, then copy and paste the output from the job log here please.


Issues with Job running on multiple nodes
