DS Tuning Options
Hi,
I created a very simple job:
1. Read EBCDIC file --> Transform --> Insert/Update Table.
We are loading around 16,300 records; right now it takes around 50 seconds.
2. After this I tried:
Read EBCDIC file --> Transform --> IPC --> Insert/Update Table.
This takes around 40 seconds.
3. Read EBCDIC file --> Link Partitioner [Round Robin] --> Transform --> IPC --> Insert/Update Table.
I used 6 partitions; this takes around 30 seconds.
I want the runtime to be LESS than 20 seconds. My Oracle script runs in 16 seconds.
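For reference, the timings above work out to the following throughput figures. This is just a quick sketch; the 16,300-row count and the per-configuration timings are the approximate numbers quoted in the post:

```python
# Rough rows-per-second for each configuration described above.
rows = 16300
timings_s = {
    "Transform only": 50,
    "Transform + IPC": 40,
    "6-way partition + IPC": 30,
    "Oracle script (target)": 16,
}
for name, secs in timings_s.items():
    print(f"{name}: {rows / secs:.0f} rows/sec")
```

Hitting the 20-second target would mean sustaining a little over 800 rows/sec end to end, roughly 2.5x the baseline throughput.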
Can anyone suggest any other options so that I can tune this job?
Let me know if you need more information.
Thanks
parag K.
ArndW,
Without the transformation, writing to a file takes 15 seconds.
I tried splitting the insert and the update; it takes the same time, 45 to 50 seconds.
It is an Oracle stage, and the update query is tuned.
Srini,
The array size is 1000, and only three transformations are present in the mapping.
I can't insert more than one IPC stage.
I have some logic in the transformations.
Thanks for your suggestions. If you have any other options, let me know.
Parag K.
Right now our jobs are in Oracle and Unix. You could say we are doing a kind of benchmarking before starting full development.
Time is very critical because it is a financial organization. Some jobs run daily and some monthly. It is a huge multi-terabyte database, and every day we are loading millions and millions of records.
Do I have to set any Administrator parameters?
Thanks for your responses.
Parag K
Parag_pk,
when a job runs for under a minute of either CPU or real time, a very significant proportion of that time is spent doing things that are only indirectly related to the data itself. This includes reading internal DataStage tables, preprocessing, forking processes, preparing and opening log files, and setting up processing environments. Let us put the overhead in your 45-second run at 10 seconds; this means that your performance numbers include roughly a 22% overhead. This overhead is constant, so if your job were to process double the number of rows it would run 80 seconds, bringing your overhead down to about 12.5%. So what the others and I are saying is that your sample is far too small either for analysis of potential tuning or for estimating expected runtimes.
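The fixed-overhead argument above can be sketched as a simple model. The numbers are hypothetical, taken from the paragraph itself: an assumed 10 seconds of startup cost inside the 45-second run over 16,300 rows:

```python
# Simple fixed-overhead model: total = startup_overhead + rows * per_row_cost.
OVERHEAD_S = 10.0                           # assumed fixed startup cost
PER_ROW_S = (45.0 - OVERHEAD_S) / 16300     # per-row cost implied by the 45 s run

def run_time_s(rows):
    """Predicted runtime for a job with constant startup overhead."""
    return OVERHEAD_S + rows * PER_ROW_S

for rows in (16300, 32600, 163000):
    total = run_time_s(rows)
    print(f"{rows} rows: {total:.0f} s total, overhead share = {OVERHEAD_S / total:.0%}")
```

At 16,300 rows the overhead is ~22% of the runtime; at ten times the volume it falls below 3%, which is why a larger sample gives a much truer picture of sustainable throughput.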
I think that any DS job that runs for less than a couple of minutes is not worth the time spent tuning it; and unless you are in a real-time environment (in which case a DS job written the way you have written it is not the correct approach) that should hold for you as well. If you are benchmarking for your application development, then I would highly recommend letting a job run 10-15 minutes to see the rows-per-second rate and to identify tuning opportunities.
All of us have, at one time or another, dealt with "huge" amounts of data. Loading millions of records in minimum time is part of our daily bread-and-butter.
If performance is such a critical aspect, then your site might consider using PX; with a data sample run of under a minute I think that PX will actually be significantly slower than Server, but it does scale rather well
(and since I'm no longer an Ascential employee and don't have to put a positive spin on everything, that is a real compliment).
Last edited by ArndW on Sat Feb 26, 2005 12:41 pm, edited 1 time in total.
Hi All,
Thank you all for the suggestions. I partially agree with you guys, and I will try a different job for benchmarking.
Still, I feel that if your database, network, and hardware are tuned and strong, then compared to other ETL tools DS should do a better job than this. But this is not the forum for that discussion.
So, thanks again anyway...
Parag K.