Improving Performance

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

soporte
Premium Member
Posts: 37
Joined: Wed Feb 08, 2006 12:33 pm
Location: Argentina

Post by soporte »

Hi,

I need to join a DataSet (left link) with 66 million records with a Sequential File (right link) with 28 million records.

I tried reading the Sequential File with a Sequential File stage using 1, 2 and 4 readers per node, but the import into the virtual DataSet is taking a lot of time.

1) Is there any tip to improve the importing process of sequential files?
2) If I need to join / merge two big sequential files (>20M records), is it possible to join / merge them without importing them into a virtual dataset in DataStage EE? If not, what is the best way to do this?

Thx
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

1. Not really. Multiple readers is about all you have unless you chop up the file first.

2. No. Sparse lookups are only available for DB2 and Oracle Enterprise stages. There is a virtual Data Set associated with every other link between non-combined operators.

You *may* get some gain by preventing the Join stage from combining, but only if you have spare CPU and memory capacity. You might also consider increasing the memory consumed by the Join stage.
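
The "chop up the file first" idea from point 1 could be sketched roughly as below. This is only an illustration done outside DataStage, not a product feature: the function names `chunk_boundaries` and `split_file` are made up for this example, and it assumes one record per line with no newlines embedded inside fields. The pieces are cut on line boundaries so no record is split in half.

```python
import os


def chunk_boundaries(path, parts):
    """Return byte offsets that cut `path` into at most `parts` pieces.

    Every offset falls at the start of a line, so no record is split.
    Assumption: records are newline-terminated and contain no embedded newlines.
    """
    size = os.path.getsize(path)
    offsets = [0]
    with open(path, "rb") as f:
        for i in range(1, parts):
            f.seek(size * i // parts)
            f.readline()  # move forward to the start of the next full record
            pos = f.tell()
            if offsets[-1] < pos < size:
                offsets.append(pos)
    offsets.append(size)
    return offsets


def split_file(path, parts, out_dir):
    """Write the pieces as <name>.part000, <name>.part001, ... in `out_dir`."""
    offsets = chunk_boundaries(path, parts)
    out_paths = []
    with open(path, "rb") as src:
        for n, (start, end) in enumerate(zip(offsets, offsets[1:])):
            name = f"{os.path.basename(path)}.part{n:03d}"
            out_path = os.path.join(out_dir, name)
            src.seek(start)
            remaining = end - start
            with open(out_path, "wb") as dst:
                while remaining > 0:
                    block = src.read(min(remaining, 1 << 20))  # copy in 1 MiB blocks
                    if not block:
                        break
                    dst.write(block)
                    remaining -= len(block)
            out_paths.append(out_path)
    return out_paths
```

Something like `split_file("/data/right_link.txt", 4, "/data/parts")` (hypothetical paths) would produce four files that can then be read as separate inputs. Whether that beats multiple readers on a single file depends on where the pieces are placed on disk.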
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
John Smith
Charter Member
Posts: 193
Joined: Tue Sep 05, 2006 8:01 pm
Location: Australia

Post by John Smith »

What do you mean by slow? How long does it take your server to import to a virtual dataset? Is your sequential file on a filesystem that spans multiple disks?
Make sure your scratch disk is not on the same filesystem as your sequential files. There is no point tuning anything in DS when you have disk contention at the OS level!
Just make sure you are not getting a lot of hits on a single disk; if you are, no amount of tuning in DS is going to help much.
It may be worth getting your AIX admin to check.
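
A quick first check for the "same filesystem" point could look like the sketch below. It is only a rough aid, not a DataStage tool: the function names are made up for this example, and it compares the device ID (`st_dev`) that the OS reports for each path. Two paths with different device IDs can still sit on the same physical disks (for example behind a volume manager or SAN), so this does not replace a look by the admin.

```python
import os


def same_filesystem(path_a, path_b):
    """True when both paths are on the same mounted filesystem (same st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def find_clashes(data_paths, scratch_paths):
    """Return (data, scratch) pairs that share a filesystem."""
    clashes = []
    for data in data_paths:
        for scratch in scratch_paths:
            if same_filesystem(data, scratch):
                clashes.append((data, scratch))
    return clashes
```

Calling `find_clashes` with the directory holding the sequential files and the scratch disk directories from your configuration file (the paths are yours to fill in) returns an empty list when none of them share a filesystem.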