
Updates Running Very Slow

Posted: Thu May 06, 2010 4:01 pm
by ds_teg
I have a job which does updates to a database table. The job design is as follows:

Sequential File -------> Oracle Enterprise (update only mode)

Please check the log below


Occurred: 3:25:11 PM On date: 5/4/2010 Type: Control
Event: Starting Job jobname. (...)

Occurred: 3:25:12 PM On date: 5/4/2010 Type: Info
Event: Environment variable settings: (...)

Occurred: 3:25:12 PM On date: 5/4/2010 Type: Info
Event: Parallel job initiated

Occurred: 3:25:12 PM On date: 5/4/2010 Type: Info
Event: OSH script (...)

Occurred: 3:25:13 PM On date: 5/4/2010 Type: Info
Event: main_program: IBM WebSphere DataStage Enterprise Edition 8.0.1.5272 (...)

Occurred: 3:25:13 PM On date: 5/4/2010 Type: Info
Event: main_program: conductor uname: -s=AIX; -r=3; -v=5; -n=pedwbt01; -m=00C1C3A04C00

Occurred: 3:25:13 PM On date: 5/4/2010 Type: Info
Event: main_program: orchgeneral: loaded (...)

Occurred: 3:46:43 PM On date: 5/4/2010 Type: Info
Event: main_program: APT configuration file: /opt/IBM/InformationServer/Server/Configurations/configiguration6x6_prod_ods.apt (...)

Occurred: 4:02:19 PM On date: 5/4/2010 Type: Info
Event: SEQF_Load_Updates,0: Progress: 10 percent.

Occurred: 4:14:31 PM On date: 5/4/2010 Type: Info
Event: SEQF_Load_Updates,0: Progress: 20 percent.
If we observe the log, there is a long delay between "3:25:13" and "3:46:43". Does anyone have any idea what exactly DataStage is doing during that time? It is taking a lot of time to read the sequential file. Previously it was running very fast, and nothing has changed at the DataStage job level.

Can someone help me resolve this?

Thanks

Posted: Thu May 06, 2010 4:32 pm
by Kryt0n
I would guess it is actually loading the data at that stage; flushes aren't always instantaneous. Are you monitoring the run at the same time? Is data flowing during that period? How large is the target table and what throughput are you getting?

Posted: Thu May 06, 2010 5:19 pm
by ds_teg
No, it's not loading any data into the table. The volume it's processing is nearly 1 million rows.

Posted: Thu May 06, 2010 5:51 pm
by Kryt0n
By loading I meant doing the updates.

What about the target size and throughput? For one million rows, and looking at over two hours to run the updates, you may have a bit of locking contention. Make sure your data is partitioned by your update key.
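
One quick way to check for lock contention while the job is running is a query like this against the standard dynamic views. This is just a rough sketch, assuming Oracle 10g or later (where V$SESSION exposes BLOCKING_SESSION) and that you have SELECT privilege on the V$ views:

-- Sketch: list sessions that are currently blocked and who is blocking them
SELECT sid,
       serial#,
       username,
       event,
       blocking_session,
       seconds_in_wait
FROM   v$session
WHERE  blocking_session IS NOT NULL;

If your DataStage update sessions show up as blocked (or blocking each other), that lines up with the partitioning advice above.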

Posted: Fri May 07, 2010 12:52 am
by ray.wurlod
What are your array size and transaction (rows/commit) size settings?

If the latter is 0, then Oracle is building and building and building a huge transaction in the undo tablespace (rollback segment, if you prefer). The larger this gets, the slower things tend to be. The solution to that is to commit more frequently.

Updates by their very nature will be slower, because the record needs to be found in each case (via the conditions in the WHERE clause), read, modified and written back - not to mention before and after images being written to the transaction log and index records also possibly needing to be updated.
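
If you want to see whether a large transaction really is building up on the Oracle side, your DBA can watch undo usage for the session while the job runs. A rough sketch against the standard dynamic views (nothing DataStage-specific assumed, just SELECT privilege on V$TRANSACTION and V$SESSION):

-- Sketch: show undo consumed by each active transaction
-- USED_UBLK = undo blocks, USED_UREC = undo records
SELECT s.sid,
       s.username,
       s.program,
       t.used_ublk,
       t.used_urec,
       t.start_time
FROM   v$transaction t
       JOIN v$session s ON s.taddr = t.addr
ORDER  BY t.used_ublk DESC;

If USED_UBLK keeps climbing without ever resetting, the job is not committing.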

Posted: Fri May 07, 2010 6:46 am
by chulett
I'd also be curious if the fields used as Key fields for the update are indexed. If not, that's a classic reason why they might take 'longer and longer' - full table scans.
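
You (or your DBA) can confirm that quickly from the data dictionary. A sketch only; TARGET_TABLE is a placeholder for your actual table name:

-- Sketch: list indexed columns on the target table so you can check
-- that every update key column appears in some index
SELECT index_name,
       column_name,
       column_position
FROM   user_ind_columns
WHERE  table_name = 'TARGET_TABLE'
ORDER  BY index_name, column_position;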

Posted: Fri May 07, 2010 7:51 am
by Sainath.Srinivasan
I would suggest loading into a temporary table and then merging with your target.
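
For illustration only, the shape of it would be something like the following. STG_UPDATES, TARGET_TABLE, KEY_COL and the other column names are all hypothetical stand-ins for your own objects, and it assumes a release of Oracle where the WHEN NOT MATCHED clause is optional (10g and later):

-- Sketch: bulk load the file into a staging table, then apply one
-- set-based MERGE instead of a million single-row updates
MERGE INTO target_table t
USING stg_updates s
ON (t.key_col = s.key_col)
WHEN MATCHED THEN
  UPDATE SET t.col1 = s.col1,
             t.col2 = s.col2;

One caveat: the MERGE will fail with ORA-30926 if the staging data has duplicate keys, hence the question below.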


Do you have any duplicates in the data?

Posted: Fri May 07, 2010 9:22 am
by ds_teg
There is an index on the column which I have specified in the WHERE clause of the update query.

Commit row interval = 1000

Commit time interval = 15

Could you please let me know how to find the array size and transaction size so that I can provide those details?

Posted: Fri May 07, 2010 9:47 am
by chulett
Those commit values are, in effect, your 'transaction size'. That being said, unless your DBAs have you cranked down pretty tight on redo/undo/rollback space, I would never commit that often for 'large' volumes.

There are two sides to every story and while Ray is (mostly) right about the use of a single end-game commit, going too far in the other direction - committing too often - can be just as bad. Me, I would either remove the commit time variable or set it way high (there's no need for it here) and then bump your row interval up significantly, either 10x or even 100x for that volume. As noted, however, you'd need to ensure with your DBA's help that there won't be any redo issues from that.

I still don't think that would explain why previously with (I assume) the same volume it 'was running very fast' and now mysteriously it takes 'longer and longer'. Has your DBA attempted to help determine the cause? Traced the session and checked the SQL explain plan? Verified that the index is indeed being used for the updates? Simply having one doesn't always mean it will be used if the stats are outdated, for example.
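
The kind of checks your DBA could run look roughly like this. A sketch only: the table, column and literal values are placeholders, and it assumes access to DBMS_XPLAN and DBMS_STATS:

-- Sketch: confirm the update uses the index rather than a full table scan
EXPLAIN PLAN FOR
  UPDATE target_table
  SET    some_col = 'x'
  WHERE  key_col  = 12345;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Sketch: refresh optimizer statistics if they turn out to be stale
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'TARGET_TABLE');
END;
/

If the plan shows TABLE ACCESS FULL instead of an index access, that on its own could account for the slowdown at this volume.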

Posted: Fri May 07, 2010 10:47 am
by ds_teg
Hi Craig,

I understand that we should not commit too often, but here I am thinking that during the time I mentioned in the post, DataStage is not doing anything and is just waiting for something to happen. I would like to know what that is. As you mentioned, the transaction size is small, and it's not committing anything during that time.

Thanks

Posted: Fri May 07, 2010 10:48 am
by ds_teg
Sainath.Srinivasan wrote:I would suggest loading into a temporary table and then merging with your target.


Do you have any duplicates in the data?
Hi Sai,

Could you please explain more about the need for the temp table?

Thanks

Posted: Fri May 07, 2010 11:36 am
by chulett
I wasn't addressing any specific period of time and what it may (or may not) be doing then, just your concerns regarding how long it was/is taking to do the actual work.