
DS Job Design

Posted: Fri Mar 09, 2007 9:10 am
by aditya
All,

I have a staging table from which data has to be loaded into 6 different tables, with only straight inserts.

The six tables have a few attributes common among them.

I had 2 job designs in mind and I wanted your suggestions on the best strategy.

Strategy 1:

One DS job loading all the data into the six different tables: the staging table, a transformer, and the six target tables in a single design.

Strategy 2:

Separate DS jobs, wherein each job selects from the staging table only the fields specific to its target table and loads the data individually.

Thanks
Aditya.

Posted: Fri Mar 09, 2007 9:15 am
by kumar_s
Is it one staging table feeding 6 different tables, or 6 staging tables feeding 6 other tables?
If it's the first case, both options are about populating the 6 target tables; but if I understand properly, the first approach involves loading the staging table as well. Please clarify.

Posted: Fri Mar 09, 2007 9:20 am
by DSguru2B
One job to select the data, do all the transformations, and then load to 6 flat files identical to your target tables.
Then a job for each table.
Altogether, 7 jobs. Keep it modular. It will help with restartability and also keeps the segments well defined.

Posted: Fri Mar 09, 2007 9:20 am
by aditya
Kumar,

Thanks for the reply. Sorry about being unclear when asking the question.

Actually there is only ONE STAGING TABLE. This table feeds data into SIX DIFFERENT TABLES.

Thanks
Aditya.

Posted: Fri Mar 09, 2007 9:24 am
by kumar_s
If performance becomes a squeeze, you can take the single-job approach to avoid multiple reads.

Posted: Fri Mar 09, 2007 10:15 am
by ray.wurlod
Do you have to stage into tables? Sequential files are much faster for staging areas. Are you REALLY still on version 4?

Posted: Fri Mar 09, 2007 1:40 pm
by aditya
Ray,

Unfortunately the client I work for is still on the Ardent 4.5.1 version. They have a process of parsing complex flat files and loading them into a staging table in Oracle, from where the data is picked up and later loaded into other ODS tables.

Thanks
Aditya.

Posted: Fri Mar 09, 2007 2:42 pm
by ray.wurlod
If performance is an issue, stage into text files and use sqlldr to get the data from there into the Oracle staging tables, perhaps in an after-job subroutine running ExecSH. You'll be amazed how much faster this can go. Don't use the ORABULK stage though - its performance is abysmal.
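A minimal sketch of the idea; the file, table, and column names here are all hypothetical, and direct=true assumes nothing on the table blocks a direct-path load:

    -- stage_load.ctl : hypothetical SQL*Loader control file.
    -- Invoked from an after-job subroutine via ExecSH, e.g.:
    --   sqlldr userid=$ORA_USER/$ORA_PASS control=stage_load.ctl direct=true
    LOAD DATA
    INFILE 'stage_output.txt'
    APPEND
    INTO TABLE stg_ods_feed
    FIELDS TERMINATED BY '|'
    TRAILING NULLCOLS
    (
      cust_id,
      cust_name,
      order_dt  DATE 'YYYY-MM-DD',
      amount
    )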

Posted: Fri Mar 09, 2007 11:20 pm
by kumar_s
Do you mean the SQL*Loader interface used by DataStage is not scalable, or is it a version issue?

Posted: Sat Mar 10, 2007 7:10 am
by chulett
I'm assuming he's specifically referring to the old ORABULK stage, which I don't really recall any specifics of. The newer ORAOCIBL stage in automatic mode loads data as fast as you can shove it into it. :wink:

Posted: Sat Mar 10, 2007 8:39 am
by ray.wurlod
The newer ORAOCIBL stage doesn't exist in version 4.x, however. To quote Mr Duke, the ORABULK stage sucks!

Posted: Sun Mar 11, 2007 6:20 pm
by oacvb
Is it a direct load from staging to all 6 tables, or are you doing any transformations in the middle? If there are no transformations in between, use 6 select reads and insert into the tables. Run them in parallel, so performance might improve provided the DB doesn't have a bottleneck. My suggestion is to write 6 different jobs and call them in parallel, so that you have restartability if the load fails for one table. Try to use ORABULK, but I'm not sure of its availability in version 4.
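To make the straight-insert option concrete, each of the six per-table jobs would boil down to something like this; the table and column names are made up for illustration only:

    -- One of the six straight inserts; ods_customer, stg_master and
    -- the column list are hypothetical names.
    INSERT INTO ods_customer (cust_id, cust_name, load_dt)
    SELECT cust_id, cust_name, load_dt
    FROM stg_master;
    COMMIT;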

Posted: Sun Mar 11, 2007 6:44 pm
by kumar_s
Six repeated reads of a single table just for restartability is not a wise approach.
Read once and write into 6 flat files, then use SQL*Loader to load them as noted, or read and write in the same job, which is one of the options you mentioned.

Posted: Sun Mar 11, 2007 7:02 pm
by oacvb
Again, it depends on how the database and network servers are configured, as well as on the number of CPUs in the environment. He could probably do a POC and decide based on that.

Posted: Sun Mar 11, 2007 7:05 pm
by kumar_s
Even if the database is very finely tuned and has high bandwidth, it's not worth hitting the database again and again for the same data, IMHO.
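For what it's worth, if the Oracle side is 9i or later (which depends on the client's environment), a multi-table insert can fan a single read of the staging table out to all the targets in one statement. A sketch with hypothetical names, showing only three of the six targets:

    -- One scan of the staging table feeding several targets at once;
    -- all table and column names are made up for illustration.
    INSERT ALL
      INTO ods_customer (cust_id, cust_name) VALUES (cust_id, cust_name)
      INTO ods_order    (order_id, cust_id)  VALUES (order_id, cust_id)
      INTO ods_payment  (order_id, amount)   VALUES (order_id, amount)
    SELECT cust_id, cust_name, order_id, amount
    FROM stg_master;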