DataStage Wide Area Network Database Performance

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

brownnt
Premium Member
Posts: 21
Joined: Tue Feb 03, 2009 6:07 pm

Post by brownnt »

We are working on a data warehouse project that involves moving data across a WAN. Because another division of the company has already invested in Teradata, management has challenged us to use a Teradata database located 400 miles from our headquarters building, where all the production data resides. We would use DataStage to query Oracle over the WAN and pull the data into the Teradata database. Does anyone have thoughts on whether this is a good idea, or any experience doing something like this with DataStage?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Should be fine, so long as you carefully manage their expectations as to performance (which won't be "fine").
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Depends on your job design I would say, and your volume of data.

We have a project that does a WAN access to another country. They had network slowness and dropped connections. Our admin team suggested that they fetch the data locally, process it, and then update their systems at the end of the flow.

They used various lookup tables from that country as part of their flow. Those connections tended to drop, since there were so many of them and the flow would span hours.

The more connections you hold open over the WAN, and the longer you hold them, the greater the chance of one being dropped. Your restartability comfort level should be based on the volume of data you need from the other site, and on a judgement call weighing the probability of failure against the cost of restarting the job.
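To put rough numbers on that judgement call, here is a minimal sketch. It assumes drops are independent and arrive at a constant rate per connection-hour, which real networks won't follow exactly. The function name and the rate of 0.01 are made up for illustration, not measured values.

```python
import math

def p_any_drop(connections: int, hours: float, drops_per_conn_hour: float) -> float:
    """Chance that at least one connection drops during the run.

    Assumes independent drops at a constant (hypothetical) rate per
    connection-hour, so the number of drops is Poisson distributed.
    """
    return 1.0 - math.exp(-drops_per_conn_hour * connections * hours)

# Illustrative numbers only: same assumed drop rate, two job designs.
few_short = p_any_drop(connections=2, hours=0.5, drops_per_conn_hour=0.01)
many_long = p_any_drop(connections=20, hours=4, drops_per_conn_hour=0.01)

print(f"2 connections for 30 minutes: {few_short:.1%}")
print(f"20 connections for 4 hours:   {many_long:.1%}")
```

Multiply the probability by what a restart costs you (rerun time, missed load window) to compare designs on the same footing.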


If you would have to replicate too much data locally, then don't do it. If you have many little tables that you access as lookup tables in your flow, then you might want to fetch them up front and drop them to datasets.
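The "fetch them up front" idea, sketched outside DataStage in plain Python. This only shows the pattern: one trip over the WAN per lookup table, then every row is matched locally. `fetch_country_codes` is a hypothetical stand-in for the remote query, and the dict stands in for a local dataset.

```python
from typing import Callable, Dict, Iterable, List, Tuple

def prefetch_lookup(fetch_all: Callable[[], Iterable[Tuple[str, str]]]) -> Dict[str, str]:
    """Pull the whole lookup table once and keep it locally."""
    return dict(fetch_all())

remote_calls: List[str] = []  # records each simulated trip over the WAN

def fetch_country_codes() -> List[Tuple[str, str]]:
    """Hypothetical stand-in for a query against the remote site."""
    remote_calls.append("country_codes")
    return [("US", "United States"), ("AU", "Australia")]

# One remote fetch before the main flow starts...
countries = prefetch_lookup(fetch_country_codes)

# ...then every row is resolved locally, with no WAN connection held open.
rows = ["US", "AU", "US"]
enriched = [(code, countries.get(code)) for code in rows]
```

The trade-off is the one above: this only pays off while the lookup tables are small enough to copy.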

400 miles... you're probably ok.
brownnt
Premium Member
Posts: 21
Joined: Tue Feb 03, 2009 6:07 pm

Post by brownnt »

We have the option of using a DataStage environment either where the source systems reside or where the target Teradata database resides. Do you think it would be better to query across the WAN from the source, or to push the data across the WAN to the target?
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Where is the bulk of the work?

Typically you want your DataStage engine closer to your source systems. You'll be doing an ET, then L, set of jobs. The Load would be from a load-ready file, since you want a good restart point. You'll want to do the Extract + Transform once, then push out the result set. Should the MLOAD to TD fail, you can restart it from the load-ready file. You might want to load to a temp table and then roll the temp table into your main table. That would help prevent duplicate entries should your job fail halfway through.

Drop temp, load it, roll it to the main version, drop temp.
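That four-step sequence can be sketched as below. This is only an illustration of the pattern: SQLite stands in for Teradata, plain inserts stand in for MLOAD, and the table and column names (`sales`, `sales_tmp`, `id`, `amount`) are made up.

```python
import sqlite3

def load_via_temp(conn: sqlite3.Connection, rows) -> None:
    """Drop temp, load it, roll it to the main table, drop temp."""
    # 1. Drop any temp table left behind by a failed earlier run.
    conn.execute("DROP TABLE IF EXISTS sales_tmp")
    conn.execute("CREATE TABLE sales_tmp (id INTEGER, amount REAL)")

    # 2. Load the temp table. If this fails, the main table is untouched
    #    and the job can simply be rerun from the load-ready file.
    conn.executemany("INSERT INTO sales_tmp VALUES (?, ?)", rows)
    conn.commit()

    # 3. Roll temp into main in a single transaction, so the main table
    #    receives either all of the rows or none of them.
    with conn:
        conn.execute("INSERT INTO sales SELECT id, amount FROM sales_tmp")

    # 4. Drop the temp table.
    conn.execute("DROP TABLE sales_tmp")
    conn.commit()

# Usage with an in-memory database and two sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
load_via_temp(conn, [(1, 10.0), (2, 20.0)])
```

The point of step 3 is that the only write to the main table is one all-or-nothing statement, so a failure earlier in the job cannot leave half a load behind.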

Should the nature of your transformation really bloat up your data, you might want your DataStage engine closer to your destination.

But, given the price of DataStage, you also want to look ahead and see what else you are going to be using it for. Odds are good you'll be using it for more than just feeding your TD boxes.

Keep it close to the majority of your data.
brownnt
Premium Member
Posts: 21
Joined: Tue Feb 03, 2009 6:07 pm

Post by brownnt »

Thanks for the help. We were also looking at the Change Data Capture plugin to reduce the amount of data going across the WAN. Has anyone else used this plugin, and is it worth the investment?