Updates Running Very Slow

ds_teg
Premium Member
Posts: 51
Joined: Tue Aug 11, 2009 6:53 am
Location: Chicago

Updates Running Very Slow

Post by ds_teg »

I have a job that performs updates against a database table. The job design is as follows:

Sequential File -------> Oracle Enterprise (update only mode)

Please check the log below:


Occurred: 3:25:11 PM On date: 5/4/2010 Type: Control
Event: Starting Job jobname. (...)

Occurred: 3:25:12 PM On date: 5/4/2010 Type: Info
Event: Environment variable settings: (...)

Occurred: 3:25:12 PM On date: 5/4/2010 Type: Info
Event: Parallel job initiated

Occurred: 3:25:12 PM On date: 5/4/2010 Type: Info
Event: OSH script (...)

Occurred: 3:25:13 PM On date: 5/4/2010 Type: Info
Event: main_program: IBM WebSphere DataStage Enterprise Edition 8.0.1.5272 (...)

Occurred: 3:25:13 PM On date: 5/4/2010 Type: Info
Event: main_program: conductor uname: -s=AIX; -r=3; -v=5; -n=pedwbt01; -m=00C1C3A04C00

Occurred: 3:25:13 PM On date: 5/4/2010 Type: Info
Event: main_program: orchgeneral: loaded (...)

Occurred: 3:46:43 PM On date: 5/4/2010 Type: Info
Event: main_program: APT configuration file: /opt/IBM/InformationServer/Server/Configurations/configiguration6x6_prod_ods.apt (...)

Occurred: 4:02:19 PM On date: 5/4/2010 Type: Info
Event: SEQF_Load_Updates,0: Progress: 10 percent.

Occurred: 4:14:31 PM On date: 5/4/2010 Type: Info
Event: SEQF_Load_Updates,0: Progress: 20 percent.

If we observe the log, there is a long delay between "3:25:13" and "3:46:43". Does anyone have any idea what exactly DataStage is doing during that time? It is taking a lot of time to read the sequential file. Previously it was running very fast, and nothing has changed at the DataStage job level.

Can someone help me resolve this?

Thanks
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

I would guess it is actually loading the data at that point; flushes aren't always instantaneous. Are you monitoring the run at the same time? Is data flowing during that period? How large is the target table, and what throughput are you getting?
ds_teg
Premium Member
Posts: 51
Joined: Tue Aug 11, 2009 6:53 am
Location: Chicago

Post by ds_teg »

No, it's not loading any data into the table. The volume it's processing is nearly 1 million rows.
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

By loading I meant doing the updates.

What about the target size and throughput? For one million rows, and looking at over two hours to run the updates, you may have a bit of locking contention. Make sure your data is partitioned by your update key.
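If you want to check for lock contention while the job is running, here is a quick sketch on the Oracle side (your DBA may need to run it, since the v$ views require privileges; all of this is illustrative):

    SELECT sid, blocking_session, event, seconds_in_wait
    FROM   v$session
    WHERE  blocking_session IS NOT NULL;

Rows coming back while the updates crawl would point at blocked sessions.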
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What are your array size and transaction (rows/commit) size settings?

If the latter is 0, then Oracle is building and building and building a huge transaction in the undo tablespace (rollback segment, if you prefer). The larger this gets, the slower things tend to be. The solution to that is to commit more frequently.

Updates by their very nature will be slower, because the record needs to be found in each case (via the conditions in the WHERE clause), read, modified and written back - not to mention before and after images being written to the transaction log and index records also possibly needing to be updated.
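For illustration only (table and column names here are made up, not taken from your job), the statement executed in update-only mode has this general shape, so every incoming row costs a keyed lookup on the target before anything is written:

    UPDATE target_table
    SET    col1 = :1,
           col2 = :2
    WHERE  key_col = :3;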
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I'd also be curious if the fields used as Key fields for the update are indexed. If not, that's a classic reason why they might take 'longer and longer' - full table scans.
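A quick way to check (the statement and names below are placeholders for whatever your job actually runs):

    EXPLAIN PLAN FOR
      UPDATE target_table SET col1 = :1 WHERE key_col = :2;

    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

Seeing TABLE ACCESS FULL in the plan instead of an index access would confirm the full table scans.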
-craig

"You can never have too many knives" -- Logan Nine Fingers
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

I would suggest loading into a temporary table and then merging with your target.


Do you have any duplicates in the data?
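A minimal sketch of that approach (all object and column names are placeholders): bulk-load the file into a work table, then apply one set-based statement such as

    MERGE INTO target_table t
    USING temp_updates s
    ON (t.key_col = s.key_col)
    WHEN MATCHED THEN
      UPDATE SET t.col1 = s.col1,
                 t.col2 = s.col2;

Note that a merge will error out if temp_updates holds duplicate keys, which is one reason the duplicates question matters.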
ds_teg
Premium Member
Posts: 51
Joined: Tue Aug 11, 2009 6:53 am
Location: Chicago

Post by ds_teg »

There is an index on the column that I have specified in the WHERE clause of the update query.

Commit row interval = 1000

Commit time interval = 15

Could you please let me know how to find the array size and transaction size so that I can provide those details?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Those commit values are, in effect, your 'transaction size'. That being said, unless your DBAs have you cranked down pretty tight on redo/undo/rollback space, I would never commit that often for 'large' volumes.

There are two sides to every story and while Ray is (mostly) right about the use of a single end-game commit, going too far in the other direction - committing too often - can be just as bad. Me, I would either remove the commit time variable or set it way high (there's no need for it here) and then bump your row interval up significantly, either 10x or even 100x for that volume. As noted, however, you'd need to ensure with your DBA's help that there won't be any redo issues from that.
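As a purely illustrative example (values to be sanity-checked with your DBA, not recommendations for your exact system):

    Commit row interval  = 100000   (100x the current 1000)
    Commit time interval = removed, or set very high so it never fires mid-run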

I still don't think that would explain why previously, with (I assume) the same volume, it 'was running very fast' and now it mysteriously takes 'longer and longer'. Has your DBA attempted to help determine the cause? Traced the session and checked the SQL explain plan? Verified that the index is indeed being used for the updates? Simply having an index doesn't always mean it will be used if the stats are outdated, for example.
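For the stats question, a sketch using the standard DBMS_STATS package (the table name is a placeholder):

    SELECT table_name, last_analyzed, num_rows
    FROM   user_tables
    WHERE  table_name = 'TARGET_TABLE';

    EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'TARGET_TABLE');

A stale last_analyzed date would be a good hint that the optimizer is working with outdated information.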
-craig

"You can never have too many knives" -- Logan Nine Fingers
ds_teg
Premium Member
Posts: 51
Joined: Tue Aug 11, 2009 6:53 am
Location: Chicago

Post by ds_teg »

Hi Craig ,

I understand that we should not commit too often, but during the time I mentioned in the post, DataStage does not appear to be doing anything; it is just waiting for something to happen. I would like to know what that is. As you mentioned, the transaction size here is small, and it is not committing anything during that time.

Thanks
ds_teg
Premium Member
Posts: 51
Joined: Tue Aug 11, 2009 6:53 am
Location: Chicago

Post by ds_teg »

Sainath.Srinivasan wrote: I would suggest loading into a temporary table and then merging with your target.


Do you have any duplicates in the data?
Hi Sai,

Could you please explain more about the need for the temp table?

Thanks
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I wasn't addressing any specific period of time and what it may (or may not) be doing then, just your concerns regarding how long it was/is taking to do the actual work.
-craig

"You can never have too many knives" -- Logan Nine Fingers