Page 1 of 1

10 Table Move

Posted: Wed Jun 28, 2006 1:34 pm
by JPalatianos
Hi,
I am brand new to DataStage and was curious what would be the best(preferred) technique for copying (no modifications) from 10 SQL Server tables to 10 DB2(Mainframe) tables.
i.e. separate job for each, a couple in each job, all in one job?

Sorry if this has been answered before but I did try searching on the topic and did not find anything.

Thanks - - John

Posted: Wed Jun 28, 2006 1:47 pm
by kcbland
Welcome aboard! Since DS is an ETL tool, it may seem cumbersome for such a simple task.

You'll probably want a job per source table that writes to a sequential text file. Each target table will have a dedicated load job that reads the appropriate text file and loads the data.

My assumption is that this is a one-time move. If you were writing true ETL, you would do things like check to see if the row already exists and then send it to an update process. But for a simple copy, where the target tables are empty, you're best to segregate extraction and loading processing not only for simplicity, but for ease in tuning, restart, etc.

Posted: Wed Jun 28, 2006 4:07 pm
by ray.wurlod
I'd also recommend a separate pair of jobs for each, with an intermediate staging area using text files. These text files can be the data files for the DB2 bulk loader if required.

You can start as many of these jobs simultaneously as the system's capacity can bear. Try not to overload the machine, and spread the I/O load over as many disk spindles as possible.

If the data are small enough, plug in a 1GB flash disk and use that as one of your staging areas.

Posted: Thu Jun 29, 2006 7:18 am
by JPalatianos
Thanks guys! I appreciate the advice.

Posted: Thu Jun 29, 2006 10:44 am
by kumar_s
Ray,Ken - Is there any strong reason why, to keep seperate job for each table? Otherthan better understandability and maintanance.

Posted: Thu Jun 29, 2006 10:50 am
by DSguru2B
kcbland wrote: you're best to segregate extraction and loading processing not only for simplicity, but for ease in tuning, restart, etc.
Kumar i think the strongest reason is for modularization and restartabilty.
My 2 centy.

Posted: Thu Jun 29, 2006 10:57 am
by kumar_s
Yes, the strongest point of restartability was hidden in my previous go thorugh.

Posted: Thu Jun 29, 2006 3:41 pm
by ray.wurlod
The real reason is that DataStage is metadata-driven, and I suspect that all 10 tables have different metadata.

You can't easily do "dynamic metadata" in DataStage.