performance tuning

vij · Post by **vij** » Thu Jan 04, 2007 10:32 pm

Hi,

According to my requirements, I have to use a sequential file with more than 20 million records as the input to a job. I thought that, instead of reading the sequential file directly, I could first load it into a Data Set and then use that Data Set in the job, and that performance would be better. Please correct me if I have misunderstood.

Thanks

kumar_s · Post by **kumar_s** » Thu Jan 04, 2007 10:57 pm

If you need to use the data more than once in your job design (I would say at least twice), a Data Set is a reasonable approach.
You need to populate the Data Set in a separate job, which has its own I/O: the Data Set has to be written and then read again, although that happens with parallel execution. So first, please explain your job design.

ray.wurlod · Post by **ray.wurlod** » Thu Jan 04, 2007 11:47 pm

You're reading the sequential file in either case. So introducing a Data Set will yield no overall gain, though time-shifting the read process may have scheduling advantages in your situation.

Have you investigated the "multiple readers per node" option for reading the sequential file?
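
To illustrate the general idea behind having several readers on one file, here is a minimal Python sketch. It is not DataStage code and does not claim to show how the Sequential File stage is implemented. It assumes newline-delimited records, and the function names (`split_ranges`, `read_range`, `parallel_read`) are hypothetical. Each reader takes a contiguous byte range and owns every record whose first byte falls inside that range.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_ranges(file_size, readers):
    """Divide [0, file_size) into `readers` contiguous byte ranges."""
    chunk = file_size // readers
    ranges = []
    for i in range(readers):
        start = i * chunk
        # The last reader absorbs any remainder bytes.
        end = file_size if i == readers - 1 else (i + 1) * chunk
        ranges.append((start, end))
    return ranges


def read_range(path, start, end):
    """Return the records whose first byte lies in [start, end)."""
    records = []
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and discard through the next newline,
            # so this reader begins on a record boundary.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            records.append(line.rstrip(b"\n"))
    return records


def parallel_read(path, readers=4):
    """Read the whole file with several readers, preserving record order."""
    size = os.path.getsize(path)
    ranges = split_ranges(size, readers)
    with ThreadPoolExecutor(max_workers=readers) as pool:
        parts = pool.map(lambda r: read_range(path, r[0], r[1]), ranges)
        return [rec for part in parts for rec in part]
```

The file is still read exactly once in total; the work is only divided among readers. Whether that helps in practice depends on the disk and the rest of the job, so it is worth measuring rather than assuming.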