Improve performance of Join of Data Sets

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy
akarsh
Participant
Posts: 51
Joined: Fri May 09, 2008 4:03 am
Location: Pune

Post by akarsh »

Hi All,

I am facing the same issue as described in this post.

I have two data sets as input to a Join stage, and it is processing only 276 rows/sec.

The details are below.

Data Set 1
-----------

Total Records: 19199366
Total 32k Blocks: 19141
Total Bytes: 2430032178

Node    Records    Blocks    Bytes
Node1   9593377    9566      1214500680
Node2   9605989    9575      1215531492

Data Set 2
-----------

Total Records: 19199355
Total 32k Blocks: 23812
Total Bytes: 3041367308

Node    Records    Blocks    Bytes
Node1   9597492    11903     1520355820
Node2   9601863    11909     1521011488
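
As a rough sanity check on the figures above, the average record size and the time a single pass over the data would take at a given rate can be worked out directly. The Python sketch below uses only the totals quoted above; the function names are made up for illustration, and the time estimate assumes the row rate stays constant for the whole run.

```python
# Totals copied from the two data set listings above.
DATASETS = {
    "Data Set 1": {"records": 19_199_366, "bytes": 2_430_032_178},
    "Data Set 2": {"records": 19_199_355, "bytes": 3_041_367_308},
}


def avg_record_bytes(records, total_bytes):
    """Average stored size of one record, in bytes."""
    return total_bytes / records


def hours_at_rate(records, rows_per_sec):
    """Hours needed to process `records` rows at a constant rate (assumption)."""
    return records / rows_per_sec / 3600


if __name__ == "__main__":
    for name, d in DATASETS.items():
        size = avg_record_bytes(d["records"], d["bytes"])
        hours = hours_at_rate(d["records"], 276)
        print(f"{name}: {size:.1f} bytes/record, {hours:.1f} h at 276 rows/sec")
```

At the reported 276 rows/sec, one pass over roughly 19.2 million rows works out to about 19 hours, so the row rate, not the data volume, is the thing to fix.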

Please suggest what can be done to improve performance.

I also added the two environment variables suggested by Ravi, keeping their default values, but it did not help.
Thanks,
Akarsh Kapoor
thompsonp
Premium Member
Posts: 205
Joined: Tue Mar 01, 2005 8:41 am

Post by thompsonp »

Akarsh

Perhaps you could follow the advice already given and post your results.
What does the rest of the job look like and are the datasets already partitioned and sorted for the join? Are they being repartitioned / sorted?

Ravi has not responded to the advice given and you have just replicated a change he made but kept default values (which is presumably the same as not adding them). There's plenty of help in that thread if you choose to follow it.
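
The questions about partitioning and sorting matter because a merge-style join walks both inputs in key order, so each input has to arrive sorted on the join key. Below is a minimal Python sketch of a full outer merge join over two inputs that are already sorted on a unique key. It illustrates the general technique only and is not DataStage's actual implementation; the function name and the (key, value) tuple layout are hypothetical.

```python
def full_outer_merge_join(left, right):
    """Full outer join of two lists of (key, value) pairs.

    Assumes both lists are sorted ascending by key and keys are unique
    within each list. Yields (key, left_value, right_value), with None
    standing in for the side that has no matching row.
    """
    i = j = 0
    while i < len(left) and j < len(right):
        lk, lv = left[i]
        rk, rv = right[j]
        if lk == rk:
            # Key present on both sides.
            yield lk, lv, rv
            i += 1
            j += 1
        elif lk < rk:
            # Key only on the left side.
            yield lk, lv, None
            i += 1
        else:
            # Key only on the right side.
            yield rk, None, rv
            j += 1
    # Drain whichever input still has rows.
    for lk, lv in left[i:]:
        yield lk, lv, None
    for rk, rv in right[j:]:
        yield rk, None, rv
```

Each input is read once, front to back. If the inputs are not already sorted and partitioned on the join key, that work has to happen before the merge, and on 19 million rows per side it can easily dominate the run time.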
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Similar != same, so now you have your own post to track your issue... which I don't believe is related to 'slow reading'. Thompsonp's questions need to be answered.
-craig

"You can never have too many knives" -- Logan Nine Fingers
akarsh
Participant
Posts: 51
Joined: Fri May 09, 2008 4:03 am
Location: Pune

Post by akarsh »

Hi thompsonp,

The job contains only the Join stage, and the join uses the same partitioning on both inputs.
The speed is 1400-1600 rows/sec.

I have also changed the buffer on the Join stage and set it to 6 MB.

The input metadata is around 500 bytes per record and the output around 1000 bytes, because the job uses a full outer join.

Earlier this was a delete-then-insert job. The delete was taking a long time (approx. 19 hours), so I am changing it to truncate and load, using a full outer join to keep the data that should not be deleted.
Thanks,
Akarsh Kapoor