how many tables can be loaded using single job

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

This is a great thread because it illustrates the flexibility in DataStage to create a solution in lots of ways, and also shows that there are important functional AND non-functional decisions that have to be made.

Obviously, there aren't any realistic limits here...you can write to a whole lot of tables, and as Arnd noted, 10 is not very many.

...but 10 jobs vs one Job? Clearly there are maintenance, support, debugging and management issues to consider, and the threads above highlight the fact that "one huge job" increases complexity. It might cost you far more in time and energy and future maintenance (you may not be the one who has to update or repair the job in the future) if you go with one Job. One large Job also takes away simple flexibility the first time someone says "I need you to re-run the load for ONLY table 7".

You didn't say how long it takes to run. There are great performance points up above regarding the source, re-reading it, etc. Depending on the source, performance could dictate the choice for you. If it were XML, for example, there are major benefits to reading an XML document only once and then parsing it many ways into different output links. And regardless of the source type, if the Job takes many hours to run, performance is at a premium. ...but if the Job runs in 5 minutes, or collectively as individual Jobs in 10 minutes, and you have a wide-open batch window, opt for better long-term maintenance and simpler debugging.
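The "read the XML once, then parse it many ways" point can be sketched outside DataStage too. Here's a minimal Python analogue: one pass over a source document, with each record routed to one of several "output links". The document structure (order/customer elements) is made up for the example and is not from this thread.

```python
# Sketch: parse one XML source exactly once and fan records out
# to multiple downstream targets, instead of re-reading the source
# once per target. Element names here are hypothetical.
import xml.etree.ElementTree as ET

doc = """
<feed>
  <order id="1" amount="250"/>
  <customer id="7" name="Acme"/>
  <order id="2" amount="90"/>
</feed>
"""

orders, customers = [], []          # two "output links"
root = ET.fromstring(doc)           # single parse of the source
for elem in root:                   # route each record by element type
    if elem.tag == "order":
        orders.append(dict(elem.attrib))
    elif elem.tag == "customer":
        customers.append(dict(elem.attrib))

print(len(orders), len(customers))  # 2 orders, 1 customer from one read
```

The cost of parsing is paid once no matter how many targets you feed, which is the same economy Ernie describes for a single job with multiple output links.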

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
sathyak098
Participant
Posts: 23
Joined: Tue May 14, 2013 6:34 am

Post by sathyak098 »

Hi,

How will it work? The target datasets have different metadata.
SURA wrote: If I were in your position, I would create a master job that does all the transformations for all the target tables, with the final targets being dataset files. Then have one multiple-instance job with RCP to load the data.
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

RCP
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
sathyak098
Participant
Posts: 23
Joined: Tue May 14, 2013 6:34 am

Post by sathyak098 »

Hi,
If possible, Could you be more specific?
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

DataStage provides the flexibility to use the same job to load different metadata. Runtime Column Propagation (RCP) allows you to do that. You can find this in the Developer's Guide provided by IBM.

Please understand how this works and assess how well the approach fits your situation. If you are happy and comfortable with it, give it a try!

The seniors and gurus have also given their comments. Please read the other answers carefully and choose your option.
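To make the RCP idea concrete, here is a rough Python illustration of the principle: one generic load routine that accepts whatever columns arrive at run time instead of hard-coding a schema. This is not DataStage code, and the table and column names are invented for the example.

```python
# Sketch of the idea behind Runtime Column Propagation: the load
# logic discovers its column set at run time, so one generic job
# can feed targets with different metadata. Names are hypothetical.
def generic_load(table, rows):
    """Build one parameterized INSERT per row from whatever columns it has."""
    statements = []
    for row in rows:
        cols = list(row)                          # columns discovered at runtime
        placeholders = ", ".join("?" * len(cols)) # one marker per column
        sql = f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})"
        statements.append((sql, tuple(row[c] for c in cols)))
    return statements

# The same routine handles two targets with different metadata:
stmts_a = generic_load("dim_customer", [{"id": 1, "name": "Acme"}])
stmts_b = generic_load("fact_sales", [{"order_id": 9, "qty": 3, "amount": 25.0}])
print(stmts_a[0][0])
print(stmts_b[0][0])
```

In DataStage terms, the multiple-instance job plays the role of `generic_load`: with RCP enabled, columns not defined in the job design are propagated through to the target, so one design serves many tables.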
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Runtime Column Propagation (RCP) is a technique for dealing with dynamic metadata in DataStage jobs. You can find more information in the Parallel Job Developer's Guide or in the IBM Information Center.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply