XML performance

vishal_rastogi · Post by **vishal_rastogi** » Wed Sep 14, 2011 5:26 am

Hi all,

I am extracting records from 29 Oracle tables and creating 29 XML files, then merging those 29 files into one XML file with a Unix script.
The script then zips the merged file and sends it over SFTP.

Currently my job takes around 30-40 seconds (depending on the data in the tables) to generate and SFTP the XML.

Is there any way to reduce the generation time by 10-15 seconds, or to double the throughput?
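
For readers following along, here is a minimal sketch of the merge-and-zip step described above. It is written in Python rather than the poster's Unix script, which is not shown in the thread. The wrapper element name `merged` and both function names are made up for illustration. The SFTP step is left out because it needs a third-party library or an external `sftp` client.

```python
import xml.etree.ElementTree as ET
import zipfile
from pathlib import Path


def merge_xml_files(paths, merged_path, root_tag="merged"):
    """Wrap the root element of every input file in one new root element.

    root_tag is a hypothetical name; the real merged document may need a
    specific root element that the thread does not mention.
    """
    root = ET.Element(root_tag)
    for path in paths:
        # Each per-table file keeps its own root as a child of the new root.
        root.append(ET.parse(path).getroot())
    ET.ElementTree(root).write(merged_path, encoding="utf-8", xml_declaration=True)
    return merged_path


def zip_file(source_path, zip_path):
    """Compress a single file into a zip archive, stored under its base name."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.write(source_path, arcname=Path(source_path).name)
    return zip_path
```

With files this small, the merge and zip steps are likely to take a fraction of a second, which is consistent with the later replies pointing at job startup rather than XML handling.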

ray.wurlod · Post by **ray.wurlod** » Wed Sep 14, 2011 6:15 am

Are you running the 29 jobs consecutively? Running them simultaneously (or in bunches, if you don't have the capacity for all 29) should cut some processing time.
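
As a rough illustration of running in bunches, the sketch below launches a list of commands with a cap on how many run at once. It assumes each job can be started from the command line (in DataStage that would typically be a `dsjob -run` call, with project and job names of your own); the function name `run_in_bunches` and the default cap of 6 are invented for this example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_in_bunches(commands, max_parallel=6):
    """Run each command as a subprocess, at most max_parallel at a time.

    commands is a list of argument lists, one per job.
    Returns the exit codes in the same order as the commands.
    """
    def run_one(command):
        # check=False: collect the exit code instead of raising on failure.
        return subprocess.run(command, check=False).returncode

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, commands))
```

Threads are enough here because each worker only waits on an external process. A new command starts as soon as a slot frees up, so one slow job does not hold back a whole bunch.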

vishal_rastogi · Post by **vishal_rastogi** » Wed Sep 14, 2011 6:18 am

I am running the jobs in bunches.
So for the 29 files I have created 5 jobs, each containing 6 Oracle to XML input stages.

chulett · Post by **chulett** » Wed Sep 14, 2011 7:21 am

Convert them to Server jobs.

(in the immortal words of BOC - don't fear the Server)

eostic · Post by **eostic** » Wed Sep 14, 2011 7:47 am

Absolutely. If they are that simple, chances are they run in less than 1 second, and the overhead is just EE Job Start up time.

Ernie
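
One way to check where the 30-40 seconds actually go is to time each step separately. The sketch below is a generic timing helper, not anything DataStage-specific; the step names you pass in (for example job start, extract, merge, zip, SFTP) and the helper names are assumptions for illustration.

```python
import time


def time_steps(steps):
    """Run (name, callable) pairs in order and return seconds spent per step."""
    timings = {}
    for name, step in steps:
        start = time.perf_counter()
        step()
        timings[name] = time.perf_counter() - start
    return timings


def slowest_step(timings):
    """Return the name of the step that took the most wall-clock time."""
    return max(timings, key=timings.get)
```

If the extract step dominates even when the tables are nearly empty, that would support the point above that fixed startup overhead, not the XML work, is the cost to attack.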

vishal_rastogi · Post by **vishal_rastogi** » Thu Sep 15, 2011 1:11 am

Thanks for your inputs.
I just wanted to know: is there any way to convert the parallel jobs into server jobs?
I understood your logic: since I am not using the parallelism and pipeline concepts, it is better to go with server jobs.

ray.wurlod · Post by **ray.wurlod** » Thu Sep 15, 2011 2:01 am

Curiously, you can use both pipeline and partition parallelism in server jobs but, in your case (with small volumes), you don't need to.

eostic · Post by **eostic** » Thu Sep 15, 2011 6:26 am

No way to directly convert them, but the methodology is similar, so you shouldn't have too much difficulty, and the syntax of the xmlStage is identical.

Save the output link definition for your xmlInput stage in EE to a tabledef and you can then just "load" that into the output link of your server xmlInput Stage.

You'll still need to manually re-apply various properties in other parts of the Stage and Job.

Ernie

vishal_rastogi · Post by **vishal_rastogi** » Fri Sep 23, 2011 9:00 am

I just want to know: if I create multiple instances of the job, will it improve the performance?

eostic · Post by **eostic** » Fri Sep 23, 2011 12:04 pm

Multiple Instances don't really apply here. There are lots of things you can do with multiple instances, one of them being a 'convenience' ...to have one job design, and run it (say) 15 times concurrently, passing different job parameters to each.

Based on what we've been discussing, going to Server Jobs is going to get you a dramatic improvement for these tiny files...mostly because the processing time for parsing the information is probably not where your bottleneck is.

Ernie

FranklinE · Post by **FranklinE** » Fri Sep 23, 2011 1:57 pm

eostic wrote:...mostly because the processing time for parsing the information is probably not where your bottleneck is.

Ernie

I'm about to post an XML performance problem, and it looks like parsing is where our bottleneck is.

eostic · Post by **eostic** » Fri Sep 23, 2011 2:36 pm

Is this still the original thread?...the original discussed writing xml, so it's hard to tell what is being discussed...

chulett · Post by **chulett** » Fri Sep 23, 2011 4:02 pm

Franklin has his own thread now, let's not muck this one up with his stuff.

FranklinE · Post by **FranklinE** » Fri Sep 23, 2011 4:07 pm

chulett wrote:Franklin has his own thread now, let's not muck this one up with his stuff.

Without muck, there can be no muckraking.