Hi,
I am brand new to DataStage and was curious what the best (preferred) technique would be for copying (no modifications) 10 SQL Server tables to 10 DB2 (mainframe) tables.
i.e., a separate job for each table, a couple of tables per job, or all of them in one job?
Sorry if this has been answered before but I did try searching on the topic and did not find anything.
Thanks - - John
10 Table Move
Welcome aboard! Since DS is an ETL tool, it may seem cumbersome for such a simple task.
You'll probably want a job per source table that writes to a sequential text file. Each target table will have a dedicated load job that reads the appropriate text file and loads the data.
My assumption is that this is a one-time move. If you were writing true ETL, you would do things like check to see if the row already exists and then send it to an update process. But for a simple copy, where the target tables are empty, you're best to segregate extraction and loading processing not only for simplicity, but for ease in tuning, restart, etc.
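The extract-to-file / load-from-file pattern described above can be sketched outside of DataStage as well. This is a minimal illustration only, using Python with sqlite3 as a stand-in for both the SQL Server source and the DB2 target; the table and file names are made up for the example.

```python
import csv
import sqlite3

def extract_to_file(src_conn, table, path):
    """Extract job: dump one source table to a delimited staging text file."""
    cur = src_conn.execute(f"SELECT * FROM {table}")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="|")
        writer.writerow([d[0] for d in cur.description])  # header row
        writer.writerows(cur)

def load_from_file(tgt_conn, table, path):
    """Load job: read the staging file back and insert into the target table."""
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter="|")
        cols = next(reader)
        placeholders = ",".join("?" * len(cols))
        tgt_conn.executemany(
            f"INSERT INTO {table} ({','.join(cols)}) VALUES ({placeholders})",
            reader,
        )
    tgt_conn.commit()

# Demo with one of the ten tables (hypothetical "customers" table).
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
src.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])

tgt = sqlite3.connect(":memory:")
tgt.execute("CREATE TABLE customers (id INTEGER, name TEXT)")

extract_to_file(src, "customers", "customers.stage.txt")
load_from_file(tgt, "customers", "customers.stage.txt")
print(tgt.execute("SELECT COUNT(*) FROM customers").fetchone()[0])  # 2
```

Because the staging file sits between the two halves, a failed load can be rerun without touching the source system, which is the restartability benefit mentioned above.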
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
I'd also recommend a separate pair of jobs for each, with an intermediate staging area using text files. These text files can be the data files for the DB2 bulk loader if required.
You can start as many of these jobs simultaneously as the system's capacity can bear. Try not to overload the machine, and spread the I/O load over as many disk spindles as possible.
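The advice about running jobs simultaneously without overloading the machine amounts to capping concurrency. In DataStage this would be done with a job sequence or the `dsjob` command line; as a generic sketch only, here is the same idea in Python with a thread pool, where `run_pair` is a hypothetical stand-in for one extract/load job pair.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical names for the ten tables being moved.
TABLES = [f"table_{n:02d}" for n in range(1, 11)]

def run_pair(table):
    # Placeholder: in reality this would start the extract job and then
    # the load job for `table` (e.g. via the dsjob CLI) and wait for both.
    return f"{table}: done"

# Cap concurrency at, say, 4 pairs so the machine isn't overloaded;
# the pool starts the next pair as soon as a worker frees up.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_pair, TABLES))

print(len(results))  # 10
```

Tuning `max_workers` up or down is how you'd spread the load across whatever CPU and disk capacity the system can bear.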
If the data are small enough, plug in a 1GB flash disk and use that as one of your staging areas.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kcbland wrote: "you're best to segregate extraction and loading processing not only for simplicity, but for ease in tuning, restart, etc."
Kumar, I think the strongest reasons are modularization and restartability.
My 2 cents.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
The real reason is that DataStage is metadata-driven, and I suspect that all 10 tables have different metadata.
You can't easily do "dynamic metadata" in DataStage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.