Data Migration Issue

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
getrajeshshetty
Premium Member
Premium Member
Posts: 1
Joined: Sat Sep 27, 2003 9:11 pm

Data Migration Issue

Post by getrajeshshetty »

We have designed data stage jobs and migrated small amount of test data. But now that we are ready to load the live data which is millions of records per table, we are having some issues.

Problem Scenario is as given below.
------------------------------------------------------------------------------------
Table Type Records Fields
------------------------------------------------------------------------------------
A ORAOCI8 836374 10 ( source table )
B ORAOCI8 836374 2 ( lookup table )
C ORAOCI8 - 11 ( target table )
R1 Sequenctial File - 2 ( reject file )
R2 Sequenctial File - 1 ( reject file )
R3 Sequenctial File - 10 ( reject file )
------------------------------------------------------------------------------------

Job Detail
==========
Field X from table A is linked to field Y from table B.
Table A is linked to transformer as Stream
Table B is linked to transformer as Reference
Table C is transformer output table.
R1 is the reject file for Field X value in table A not matching any of field Y value in table B.
R2 is the reject file for Field Z in table A with NULL values
R3 is the general reject file which captures all records not transfered to Target C , R1 and R2

Process Detail
==============
Process started on JUN 27 2002 at 4:19 PM and was aborted on JUN 29 2002 at 1:59 AM.
When process was aborted 41780 records were transferred to Target Table C.

Kindly note that the source table and the target table are accessed from 2 different oracle data sources

Problem
=======
First of all 2 days for migration of 836374 records is way beyond the performance we are looking forward for and even then the jobs are getting aborted before completion.Some of our tables have 25 million records to be migrated.

Do we need to relook at our jobs or is this some server installation issue...Looking forward for Reply.

Thanks.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

There are some tips for making your job run faster:

- Do not attempt to use an Oracle table as a lookup/reference stage. If tables A and B are from the same database you can join them in the select clause or create a view that joins them. Otherwise dump table B into a hash file and use this as a lookup. A hash file lookup is very fast.
- Try an Oracle bulk load instead of writing directly through and OCI stage. Oracle bulk load via direct mode is very fast. The trick is to output the data to a Sequential file stage to create the dat file and output zero rows to an Oracle bulk loader stage which will create the control file. Do not attempt to create a dat file using the Bulk Loader stage as it is much slower than the sequential file stage. Do a search in the Datastage forum archives for more info on Oracle bulk load.

- Consider running jobs in parallel. If your DataStage server is on Unix consider the DataStage Parallel extender. Otherwise turn your job into a multiple instance job and pass a different sql string to each one which partitions your data into four chunks. Process the table in four jobs. Bulk load the 4 output files sequentially.
If you have several tables to transfer then run independant jobs in parallel.

You will probably find that your processing time drops from 2 days to under 2 hours.
Post Reply