Dataset reading problem when environment changed

Posted: Wed Sep 07, 2011 1:03 am
by mrvsr
Hi,

I have designed a job with the following stages:

Dataset ---> Copy ---> Copy

In one environment it takes 4 minutes to read a dataset with 20 million records; in the other environment it takes 30+ minutes to finish.

I am not sure what is going wrong.
In both environments the OS is the same and the DS version is the same.

1. The DS file is generated in the same way.
2. iostat output is the same.
3. Changed/increased buffer settings, but no improvement.
4. Resource availability is the same.
5. Config file is the same except for node names.
6. Rows/sec differ between the environments (100000 vs 10000).
7. No schema differences.
8. Dataset sizes are the same in both environments.
9. The job is multi-instance but runs as a single instance in both environments.
10. With APT_PM_PLAYER_TIMING added, the dataset (ds) operator is reported as taking a long time.
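
As a rough sanity check on the row rates in point 6, the arithmetic can be sketched as below. It assumes a steady read rate for the whole run, which is a simplification; the function name is only illustrative.

```python
def elapsed_minutes(rows: int, rows_per_sec: float) -> float:
    """Return the minutes needed to read `rows` at a steady rate."""
    return rows / rows_per_sec / 60


ROWS = 20_000_000  # record count quoted in the post

fast = elapsed_minutes(ROWS, 100_000)  # rate seen in the fast environment
slow = elapsed_minutes(ROWS, 10_000)   # rate seen in the slow environment

print(f"fast: {fast:.1f} min, slow: {slow:.1f} min, ratio: {slow / fast:.0f}x")
```

The two rates work out to roughly 3.3 and 33.3 minutes, which lines up with the observed 4 minutes and 30+ minutes, so the whole gap is accounted for by the read rate itself.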

In this job, reading the dataset takes much longer in one environment than in the other, but other jobs that read other datasets do not show that much difference.

Please let me know how to fix this issue.

Thanks in advance.

Posted: Wed Sep 07, 2011 1:10 am
by ray.wurlod
What precisely is the "issue" that you seek help to fix?

Are you reading from a Data Set that was written when a different configuration file was in force?

Posted: Wed Sep 07, 2011 7:15 am
by mrvsr
Thanks, Ray, for your reply.

No. The same configuration file is used to write and read the dataset.
One job writes to the dataset and another reads from it.

Both jobs run fast in one environment. In the other environment, the first job, which writes to the dataset, runs for about the same time, but the reading job takes much longer.
Even reading a single field takes longer than in the other environment.

I want to fix this reading job so that it runs for the same time in both environments.

Posted: Wed Sep 07, 2011 5:35 pm
by ray.wurlod
Then you must answer the obvious question - what is different between the two environments? "Nothing" is clearly not the correct answer.
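
One way to start answering that question is to dump the environment variables in each environment (for example with `env`) and compare them. The sketch below is only an illustration: the helper names and the sample values are hypothetical, and it covers environment variables only, not disk layout, file systems, or network differences.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse NAME=value lines (as printed by `env`) into a dict."""
    result = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip():
            result[name.strip()] = value
    return result


def diff_env(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple]:
    """Return {name: (value_in_a, value_in_b)} for every setting that differs.

    A setting missing on one side is reported as None on that side.
    """
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }


# Hypothetical sample dumps; real ones would come from each environment.
fast_env = parse_env("APT_CONFIG_FILE=/opt/cfg/fast.apt\nTMPDIR=/tmp\nLANG=C")
slow_env = parse_env("APT_CONFIG_FILE=/opt/cfg/slow.apt\nTMPDIR=/tmp")

for name, (left, right) in diff_env(fast_env, slow_env).items():
    print(f"{name}: fast={left!r} slow={right!r}")
```

Settings that are identical on both sides are left out, so only the candidates worth investigating are printed.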