performance tuning

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

vij
Participant
Posts: 131
Joined: Fri Nov 17, 2006 12:43 am


Post by vij »

Hi,

According to my requirements, I have to use a sequential file with more than 20 million records as the input to a job. Instead of reading the sequential file directly, I thought that if I first load it into a dataset and then use that dataset in the job, performance would be better. Please correct me if I have misunderstood.

Thanks
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

If you need to use the data more than once in your job design (twice, say), then a dataset is worth considering.
You need to populate the dataset with a separate job, which has its own I/O: the dataset has to be written and then read again, although both steps run in parallel. So first, please explain your job design.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You're reading the sequential file in either case. So introducing a Data Set will yield no overall gain, though time-shifting the read process may have scheduling advantages in your situation.

Have you investigated the "multiple readers per node" option for reading the sequential file?
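
As an aside for readers new to the idea, the general principle of reading one flat file with several readers can be sketched in Python. This is only a conceptual illustration, not DataStage code and not a description of how the engine implements the option; the function names (`split_ranges`, `count_records`, `parallel_count`) and the assumption of newline-terminated records are hypothetical choices made for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_ranges(path, readers):
    """Return one (start, end) byte range per reader, together covering the file."""
    size = os.path.getsize(path)
    step = size // readers
    bounds = [i * step for i in range(readers)] + [size]
    return list(zip(bounds[:-1], bounds[1:]))


def count_records(path, start, end):
    """Count the newline-terminated records whose first byte lies in [start, end)."""
    with open(path, "rb") as f:
        if start > 0:
            # Back up one byte and read to the next newline. This skips any
            # record that began before `start`; the previous reader owns it.
            f.seek(start - 1)
            f.readline()
        count = 0
        while f.tell() < end:
            line = f.readline()
            if not line:  # end of file
                break
            count += 1
        return count


def parallel_count(path, readers=4):
    """Read the file with several concurrent readers and total their record counts."""
    ranges = split_ranges(path, readers)
    with ThreadPoolExecutor(max_workers=readers) as pool:
        counts = pool.map(lambda r: count_records(path, *r), ranges)
        return sum(counts)
```

Each reader starts at its own offset and handles only the records that begin inside its range, so no record is read twice or missed. Whether this helps in practice depends on the storage being able to serve several concurrent reads.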
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.