Updating Dimension table taking too Long

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
ksmurthys
Participant
Posts: 35
Joined: Mon Aug 23, 2004 3:02 pm

Updating Dimension table taking too Long

Post by ksmurthys »

Hi ALL,

My product dimension is 600000 rows n tried to update that table taking too long( My product no is CHAR field ). Update action is updating existing or insert new rows.
Job design is

ODBC staging table ----> Transformer -----> ODBC dimesion table
Update action is updating existing or insert new rows n table is only six columns.


Any suggestions how to improve the performance ??
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

1. Analyze source system server load and make sure that machine is not overwhelmed right now.
2. Analyze target system server load and make sure that machine is not overwhelmed right now.
3. Extract your data to file to minimize delays imposed on source system because of target system drag. ODBC --> XFM --> SEQ
4. Separate inserts from updates and allow INSERTS to use INSERT only SQL, as well as allow updates to do UPDATE statements only.
5. Transform your data into sequential load files so that you can see the true time it takes to just insert data, and also see how long just pure updates take.

Historically, updating data takes longer than inserting data. Maybe today you have a lot of updates, whereas yesterday you had more inserts. You really need to do fundamental elementary analysis before we can help you. "My job is slow" usually doesn't help us out much. Good luck.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
KeithM
Participant
Posts: 61
Joined: Thu Apr 22, 2004 11:34 am
Contact:

Post by KeithM »

Based on your job design it appears that you are not doing any change data detection and are updating all 600,000 rows. If that is true I would question whether it is necessary to update every row every time your job runs. If only a small percentage of the records are changing on an incremental basis, you have a lot of overhead going on that is not necessary. Search the forum for 'CRC32' which will show you posts on creating jobs that determine when a record has changed. Also search for 'SCD' or 'slowly changing dimensions' for example of splitting inserts and updates into separate data flows. This will also help improve performance. I hope this helps.
Keith
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Some methods are:
1.) Use surrogate key values (integer) instead of char
2.) Use direct-call / wired stage instead of ODBC
3.) Use IPC stage (if you have multiple processors)
4.) Use partition in your DB and load data split by partitions
5.) Use options to merge table (such as Oracle Merge command)
6.) Remove columns you do not intend to populate
ksmurthys
Participant
Posts: 35
Joined: Mon Aug 23, 2004 3:02 pm

Post by ksmurthys »

Sainath.Srinivasan wrote:Some methods are:
1.) Use surrogate key values (integer) instead of char
COuld u pls explain more in detail??
Sainath.Srinivasan wrote: 2.) Use direct-call / wired stage instead of ODBC
My source system is AS/400 which i can connect through ODBC driver becose its located in Some other place.
Sainath.Srinivasan wrote: 3.) Use IPC stage (if you have multiple processors)
We are not using multiple processors
Sainath.Srinivasan wrote: 4.) Use partition in your DB and load data split by partitions
I don't know how to implement partitions in SQL server.Any tips?
Sainath.Srinivasan wrote: 5.) Use options to merge table (such as Oracle Merge command)
No idea about this stage my target is SQLSERVER database
Sainath.Srinivasan wrote: 6.) Remove columns you do not intend to populate
Here i am populating only five columns
Post Reply