xml performance

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

vishal_rastogi
Participant
Posts: 47
Joined: Thu Dec 09, 2010 4:37 am


Post by vishal_rastogi »

Hi all,

I am extracting records from 29 Oracle tables and creating 29 XML files, then merging the 29 XML files into one XML file with a Unix script, then zipping the merged file and sending it via SFTP, also with a Unix script.

Currently my job takes around 30-40 seconds (depending on the data in the tables) to generate and SFTP the XML.

Is there any way I can reduce the generation time by 10-15 seconds, or double the throughput?
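
For readers following along, here is a minimal Python sketch of the merge and zip steps described in this post. The thread does not show the actual Unix script, so the layout of the merged document (each file's root element wrapped under one new root, here given the hypothetical tag `merged`) and every file name are assumptions. The SFTP step is left out because it depends on tooling the thread does not name.

```python
import tempfile
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path


def merge_xml_files(paths, merged_path, root_tag="merged"):
    """Wrap the root element of every input file under one new root element."""
    root = ET.Element(root_tag)  # root_tag is a hypothetical wrapper name
    for path in paths:
        root.append(ET.parse(str(path)).getroot())
    ET.ElementTree(root).write(str(merged_path), encoding="utf-8",
                               xml_declaration=True)
    return Path(merged_path)


def zip_file(path, zip_path):
    """Compress a single file into a zip archive, storing only its base name."""
    with zipfile.ZipFile(str(zip_path), "w",
                         compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(str(path), arcname=Path(path).name)
    return Path(zip_path)


if __name__ == "__main__":
    # Two tiny stand-in files; the real job would have 29 generated files.
    work = Path(tempfile.mkdtemp())
    (work / "table1.xml").write_text("<table1><row>1</row></table1>")
    (work / "table2.xml").write_text("<table2><row>2</row></table2>")
    merged = merge_xml_files(sorted(work.glob("table*.xml")),
                             work / "merged.xml")
    print(zip_file(merged, work / "merged.zip"))
```

Parsing every file into memory is fine for small extracts like the ones described here; much larger files would call for a streaming approach instead.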
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are you running the 29 jobs consecutively? Running them simultaneously (or in bunches, if you don't have the capacity for all 29) should cut some processing time.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
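
A rough sketch of the "in bunches" idea from the reply above, written in Python rather than as a DataStage sequence. It runs a list of commands with a cap on how many execute at once. The commands themselves are placeholders: the thread does not show how the jobs are launched, so whatever starts one extract job would take their place, and the cap of 6 is an arbitrary assumption.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_in_bunches(commands, max_parallel=6):
    """Run each command as a subprocess, at most max_parallel at a time.

    Returns the exit codes in the same order as the commands.
    """
    def run_one(cmd):
        # subprocess.run blocks until the command has finished.
        return subprocess.run(cmd, check=False).returncode

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere; each entry would
    # really be the command line that starts one extract job.
    demo = [[sys.executable, "-c", "pass"] for _ in range(6)]
    codes = run_in_bunches(demo, max_parallel=3)
    failed = sum(1 for code in codes if code != 0)
    print(failed, "of", len(codes), "commands failed")
```

Threads are enough here because each worker only waits on an external process; the cap simply keeps the server from being asked to start everything at once.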
vishal_rastogi
Participant
Posts: 47
Joined: Thu Dec 09, 2010 4:37 am

Post by vishal_rastogi »

I am running the jobs in bunches.
So for the 29 files I have created 5 jobs, each containing 6 Oracle-to-XML input stages.
Vish
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Convert them to Server jobs. :wink:

(in the immortal words of BOC - don't fear the Server)
Last edited by chulett on Wed Sep 14, 2011 7:51 am, edited 1 time in total.
-craig

"You can never have too many knives" -- Logan Nine Fingers
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Absolutely. If they are that simple, chances are they run in less than 1 second, and the overhead is just EE job startup time.

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
vishal_rastogi
Participant
Posts: 47
Joined: Thu Dec 09, 2010 4:37 am

Post by vishal_rastogi »

Thanks for your inputs.
I just wanted to know: is there any way to convert the parallel jobs into server jobs?
I understood your logic: I am not using the parallelism and pipeline concepts, so it is better to go with server jobs.
Vish
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Curiously, you can use both pipeline and partition parallelism in server jobs, but in your case (with small volumes) you don't need to.
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

No way to directly convert them, but the methodology is similar, so you shouldn't have too much difficulty, and the syntax of the xmlStage is identical.

Save the output link definition for your xmlInput stage in EE to a tabledef and you can then just "load" that into the output link of your server xmlInput Stage.

You'll still need to manually re-apply various properties in other parts of the Stage and Job.

Ernie
vishal_rastogi
Participant
Posts: 47
Joined: Thu Dec 09, 2010 4:37 am

Post by vishal_rastogi »

I just want to know: if I create multiple instances of the job, will it improve the performance?
Vish
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Multiple Instances don't really apply here. There are lots of things you can do with multiple instances, one of them being a 'convenience' ...to have one job design, and run it (say) 15 times concurrently, passing different job parameters to each.

Based on what we've been discussing, going to Server Jobs is going to get you a dramatic improvement for these tiny files...mostly because the processing time for parsing the information is probably not where your bottleneck is.

Ernie
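
To illustrate the "one job design, different parameters" convenience mentioned above, here is a small Python sketch that builds one command line per invocation of a multi-instance job. The project, job, invocation and parameter names are all hypothetical. The `dsjob -run -param name=value project job.invocationid` shape is written from memory and should be treated as an assumption to check against the product documentation for your version.

```python
def instance_commands(project, job, params_by_invocation):
    """Build one argument list per invocation of a multi-instance job.

    params_by_invocation maps an invocation id to that run's job parameters.
    """
    commands = []
    for invocation_id, params in params_by_invocation.items():
        cmd = ["dsjob", "-run"]  # assumed CLI shape, verify before use
        for name, value in params.items():
            cmd += ["-param", f"{name}={value}"]
        # Assumption: an invocation is addressed as job.invocationid.
        cmd += [project, f"{job}.{invocation_id}"]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Hypothetical names throughout; nothing here comes from the thread.
    runs = {
        "CUST": {"TableName": "CUSTOMERS", "OutFile": "customers.xml"},
        "ORD": {"TableName": "ORDERS", "OutFile": "orders.xml"},
    }
    for cmd in instance_commands("MyProject", "ExtractToXml", runs):
        print(" ".join(cmd))
```

As the post notes, this is a design convenience; it does nothing about per-job startup overhead.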
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

eostic wrote:...mostly because the processing time for parsing the information is probably not where your bottleneck is.

Ernie
I'm about to post an XML performance problem, and it looks like parsing is where our bottleneck is.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Is this still the original thread?...the original discussed writing xml, so it's hard to tell what is being discussed...
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Franklin has his own thread now, let's not muck this one up with his stuff. :wink:
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

chulett wrote:Franklin has his own thread now, let's not muck this one up with his stuff. :wink:
Without muck, there can be no muckraking. :P