regarding Performance
Posted: Tue Jan 31, 2006 2:46 am
by sudhakar_viswa
Hi,
To check performance, how many records are needed? I usually take 10 to 20 records.
Thanks,
sudhakar
Posted: Tue Jan 31, 2006 2:57 am
by ArndW
That is a very small sample and won't return meaningful results; any slight change in the system (a couple of other processes doing a bit of work during your sampling period) will give you wildly different results.
For DataStage jobs I won't use any sample that runs for less than about 5 minutes, and preferably longer. I'll also run it several times over a period of time to see whether I get a large standard deviation in the speeds achieved.
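A minimal sketch of that check, assuming you note the row count and elapsed seconds of each repeated run by hand or from the job log. The function name `summarize_runs` and the sample figures are made up for illustration; they are not DataStage output.

```python
import statistics

def summarize_runs(rows_processed, elapsed_seconds):
    """Return (mean, sample std deviation, relative spread) of rows/second."""
    rates = [rows / secs for rows, secs in zip(rows_processed, elapsed_seconds)]
    mean = statistics.mean(rates)
    stdev = statistics.stdev(rates)  # sample std deviation; needs at least two runs
    return mean, stdev, stdev / mean

# Hypothetical figures: four runs of the same job, each about 5 minutes long.
rows = [3_000_000, 3_000_000, 3_000_000, 3_000_000]
secs = [300.0, 310.0, 295.0, 305.0]

mean, stdev, spread = summarize_runs(rows, secs)
print(f"mean={mean:.0f} rows/s  stdev={stdev:.0f} rows/s  spread={spread:.1%}")
```

If the relative spread (standard deviation divided by mean) is large, the runs are too short or the system is too noisy for the figures to mean much.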
Posted: Tue Jan 31, 2006 3:03 am
by sudhakar_viswa
Hi ARND,
I want the number, i.e. how many records are needed to check the performance.
bye,
sudhakar
Posted: Tue Jan 31, 2006 3:17 am
by ArndW
The answer is enough rows to make your job run for at least several minutes. I don't know your job or configuration; some installations are happy to get 500 rows/second while others get 40,000 rows/second.
The sample should be large enough to even out other system factors. Your standard deviation across repeated runs should be small; with 10-20 records your deviation will be huge and the resulting statistics won't mean anything, even on a lightly loaded Windows server. There are caches and buffers built into every aspect of a system (disk drive, disk controller, disk buffer memory, CPU cache, etc.), so with a small sample you might get some great speeds simply because everything is done in cache.
That reminds me of a performance monitoring test that I wrote for a large health insurance company that was moving to an EMC disk array. I had it fire off 3000 simultaneous users that ran hundreds of thousands of simulated user queries, processed them and wrote data back. The test was supposed to stress-test the disk I/O subsystem for at least 12 hours, but it ran in under 5 seconds because the EMC had stored the whole database in its cache...
Posted: Tue Jan 31, 2006 5:57 am
by sudhakar_viswa
Hi arnd,
Thanks for your reply. I am asking in general, not about my scenario.
Thanks,
sudhakar
Posted: Tue Jan 31, 2006 6:07 am
by ArndW
Sudhakar,
I am trying to illustrate that there is no set minimum number of rows that gives reliable performance statistics. You need to achieve a minimum job runtime (the longer the better, so that job startup time plays a smaller role) and a low standard deviation between test runs. How many rows it takes to do this is irrelevant.
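The startup-time point can be shown with a little arithmetic. The sketch below assumes a very simple model (a fixed startup cost followed by a constant steady-state rate); the function name `measured_rate` and the figures of 10 seconds startup and 10,000 rows/second are hypothetical, chosen only to show the shape of the effect.

```python
def measured_rate(rows, startup_seconds, steady_rate):
    """Rows/second as it would be reported for the whole run, startup included.

    Assumed model: elapsed time = fixed startup + rows / steady-state rate.
    """
    elapsed = startup_seconds + rows / steady_rate
    return rows / elapsed

# Hypothetical job: 10 s startup, 10,000 rows/s once it is running.
for rows in (20, 10_000, 600_000, 3_000_000):
    rate = measured_rate(rows, startup_seconds=10.0, steady_rate=10_000.0)
    print(f"{rows:>9} rows -> {rate:8.1f} rows/s reported")
```

Under these assumptions a 20-row test reports almost nothing but startup time, while a run lasting several minutes reports a rate close to the steady-state figure.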