Hi,
I am brand new to DataStage and was curious what the best (preferred) technique would be for copying (no modifications) 10 SQL Server tables to 10 DB2 (mainframe) tables.
i.e., a separate job for each table, a couple of tables per job, or all of them in one job?
Sorry if this has been answered before but I did try searching on the topic and did not find anything.
Thanks - - John
10 Table Move
Welcome aboard! Since DS is an ETL tool, it may seem cumbersome for such a simple task.
You'll probably want a job per source table that writes to a sequential text file. Each target table will have a dedicated load job that reads the appropriate text file and loads the data.
My assumption is that this is a one-time move. If you were writing true ETL, you would do things like check to see if the row already exists and then send it to an update process. But for a simple copy, where the target tables are empty, you're best to segregate extraction and loading processing not only for simplicity, but for ease in tuning, restart, etc.
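The extract-to-file / load-from-file pattern described above can be sketched outside of DataStage as well. This is a minimal illustration only, using Python with sqlite3 as a stand-in for both the SQL Server source and the DB2 target; the table and file names are made up for the example.

```python
import csv
import sqlite3

def extract_to_file(src_conn, table, path):
    """Extract job: dump one source table to a delimited staging text file."""
    cur = src_conn.execute(f"SELECT * FROM {table}")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="|")
        writer.writerow([d[0] for d in cur.description])  # header row
        writer.writerows(cur)

def load_from_file(tgt_conn, table, path):
    """Load job: read the staging file back and insert into the target table."""
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter="|")
        cols = next(reader)
        placeholders = ",".join("?" * len(cols))
        tgt_conn.executemany(
            f"INSERT INTO {table} ({','.join(cols)}) VALUES ({placeholders})",
            reader,
        )
    tgt_conn.commit()

# Demo with one of the ten tables (hypothetical "customers" table).
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
src.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])

tgt = sqlite3.connect(":memory:")
tgt.execute("CREATE TABLE customers (id INTEGER, name TEXT)")

extract_to_file(src, "customers", "customers.stage.txt")
load_from_file(tgt, "customers", "customers.stage.txt")
print(tgt.execute("SELECT COUNT(*) FROM customers").fetchone()[0])  # 2
```

Because the staging file sits between the two halves, a failed load can be rerun without touching the source system, which is the restartability benefit mentioned above.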
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
I'd also recommend a separate pair of jobs for each, with an intermediate staging area using text files. These text files can be the data files for the DB2 bulk loader if required.
You can start as many of these jobs simultaneously as the system's capacity can bear. Try not to overload the machine, and spread the I/O load over as many disk spindles as possible.
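The advice about running jobs simultaneously without overloading the machine amounts to capping concurrency. In DataStage this would be done with a job sequence or the `dsjob` command line; as a generic sketch only, here is the same idea in Python with a thread pool, where `run_pair` is a hypothetical stand-in for one extract/load job pair.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical names for the ten tables being moved.
TABLES = [f"table_{n:02d}" for n in range(1, 11)]

def run_pair(table):
    # Placeholder: in reality this would start the extract job and then
    # the load job for `table` (e.g. via the dsjob CLI) and wait for both.
    return f"{table}: done"

# Cap concurrency at, say, 4 pairs so the machine isn't overloaded;
# the pool starts the next pair as soon as a worker frees up.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_pair, TABLES))

print(len(results))  # 10
```

Tuning `max_workers` up or down is how you'd spread the load across whatever CPU and disk capacity the system can bear.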
If the data are small enough, plug in a 1GB flash disk and use that as one of your staging areas.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kcbland wrote: "you're best to segregate extraction and loading processing not only for simplicity, but for ease in tuning, restart, etc."
Kumar, I think the strongest reasons are modularization and restartability.
My 2 cents.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
The real reason is that DataStage is metadata-driven, and I suspect that all 10 tables have different metadata.
You can't easily do "dynamic metadata" in DataStage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.