Performance tuning in Server Job (part of a sequence)


DSShishya
Premium Member
Posts: 37
Joined: Tue Oct 27, 2009 9:43 pm

Performance tuning in Server Job (part of a sequence)

Post by DSShishya »

Dear all,

I'm trying to performance-tune a Job Sequence (Version 8.1) that has about 25 SERVER jobs, mainly of 3 types:

1. Conversion
2. Lookup
3. Load

First I ran the sequence and noted the time each job took to complete. To improve performance, I thought it would be less invasive to tweak the resources first, so I enabled In Process Row buffering, especially for the lookup jobs containing Hash files. Run time for the individual jobs improved by about 25%.

Next I looked at a LOAD job in the sequence, which has 3 stages: Seq File, Transformer and ORAOCI.

The job is pretty simple: it just transfers data from the Seq File to the Oracle DB (Seq File --> XFM --> ORAOCI), with 3 columns added in the Transformer using very simple transformations. No stage variables are used.

I ran this job for a file of about 2 million records and it took almost 2 hours to complete.

I then added two IPC stages to this job, one between the Seq File and the Transformer and the other between the Transformer and the ORAOCI (Seq File --> IPC --> XFM --> IPC --> ORAOCI).

I ran the job again and WOW! The job finished in about 7 mins, as opposed to around 2 hrs earlier.
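To put the two run times on the same scale, here is a small arithmetic sketch in Python. It uses only the figures quoted above, which are approximate ("about 2 million", "almost 2 hours", "about 7 mins"), so the results are rough estimates rather than measurements.

```python
# Figures quoted in the post; all of them are approximate.
ROWS = 2_000_000
BEFORE_SECONDS = 2 * 60 * 60   # run without IPC stages ("almost 2 hours")
AFTER_SECONDS = 7 * 60         # run with two IPC stages ("about 7 mins")


def rows_per_second(rows: int, seconds: int) -> float:
    """Average throughput over the whole run."""
    return rows / seconds


before_rate = rows_per_second(ROWS, BEFORE_SECONDS)
after_rate = rows_per_second(ROWS, AFTER_SECONDS)
speedup = BEFORE_SECONDS / AFTER_SECONDS

print(f"before:  {before_rate:.0f} rows/s")
print(f"after:   {after_rate:.0f} rows/s")
print(f"speedup: {speedup:.1f}x")
```

That works out to roughly 278 rows/s before and 4762 rows/s after, about a 17x difference. A jump that large is worth double-checking by confirming that both runs actually loaded the same number of rows.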

But I ran into two problems.

1. In the ORAOCI stage the option given is "Clear table and then insert". But when I ran the modified job again, it aborted after loading the first few rows. I suspect the table was not cleared of the earlier data. Did adding the IPC stage before the ORAOCI stage somehow prevent the table from being cleared? (It later worked when I manually cleared the table.)

2. When I ran the entire sequence (after clearing the table) with this modified job in it, the sequence aborted when it reached this job (there is a lookup job before it). The Director showed that the modified job had aborted. Is that because of the "Time Out" factor in the IPC stage?
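As a way to picture the timeout question, here is a minimal Python analogy of two stages joined by a bounded buffer. It is not how DataStage implements the IPC stage. The function name, parameters and values (`run_pipeline`, `buffer_rows`, `timeout_s`, `reader_delay_s`) are all hypothetical. It only shows the general behaviour: if the downstream side is not consuming, the upstream writer fills the buffer, waits for the timeout, and then gives up.

```python
import queue
import threading
import time


def run_pipeline(buffer_rows, timeout_s, reader_delay_s, total_rows):
    """Push rows through a bounded buffer; return (status, rows_read)."""
    buf = queue.Queue(maxsize=buffer_rows)
    rows_read = []
    stop = threading.Event()

    def reader():
        # Stands in for the downstream stage; a long delay models a
        # target that is not yet ready to accept rows.
        time.sleep(reader_delay_s)
        while not stop.is_set():
            try:
                row = buf.get(timeout=0.05)
            except queue.Empty:
                continue
            if row is None:          # end-of-data marker
                return
            rows_read.append(row)

    worker = threading.Thread(target=reader, daemon=True)
    worker.start()

    status = "ok"
    try:
        for row in range(total_rows):
            # Blocks while the buffer is full, up to timeout_s seconds.
            buf.put(row, timeout=timeout_s)
        buf.put(None, timeout=timeout_s)
    except queue.Full:
        status = "timeout"
        stop.set()

    worker.join()
    return status, len(rows_read)
```

With a responsive reader, `run_pipeline(4, 0.2, 0.0, 100)` completes normally. With a reader that stalls for longer than the timeout, `run_pipeline(4, 0.2, 1.0, 100)` reports a timeout before any row is consumed. Whether this is what happened in the job above can only be confirmed from the job log.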

If any of you DS pundits could help me with this I would really appreciate it.

Also, could you let me know of any risk factors associated with using the IPC stage (do's and don'ts of IPC), and what kind of performance tuning methods are typically used for Lookup and Load jobs (SERVER jobs)?

Sorry for the long story, but I had to explain in detail to give a clear picture. Any help would be highly appreciated.

Regards
Abhijeet1980
Participant
Posts: 81
Joined: Tue Aug 15, 2006 8:31 am
Location: Zürich

Re: Performance tuning in Server Job (part of a sequence)

Post by Abhijeet1980 »

Hello DSShishya,

This is more about tuning the individual jobs than the sequence job (green in color).

A job sequence is more about control and sequencing of the jobs.

For the analysis to proceed, it is necessary to find out why the job (not the sequence but the exact job) took 2 hrs (now 7 mins). Things to look at:

. Writing to Hash file.
. Extraction from a source.
. Writing to a target.
. Details of the components used, etc.

-Abhijit
DSShishya
Premium Member
Posts: 37
Joined: Tue Oct 27, 2009 9:43 pm

Re: Performance tuning in Server Job (part of a sequence)

Post by DSShishya »

Thanks Abhijit, I will do more analysis on this.

Can you also tell me whether the IPC stage should be used ONLY between two active stages, or is it fine to use an IPC stage between an active and a passive stage?

Also (this may sound stupid), if I use IPC between an active and a passive stage (target DB), could any data be lost, or could the transformations fail to be applied to any rows?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

:!: Please don't think that IPC is some kind of magical Silver Bullet such that you can run around and shoot every job with impunity as if you were infested with werewolves. Out of the thousands of Server jobs I've built over the years, probably only a handful actually "needed" IPC.

Our good friend Ernie Ostic has posted about this numerous times, I've pulled some example links for you. Please read what he posts in them carefully and pay good heed to his advice - he knows of what he speaks:

viewtopic.php?t=129621
viewtopic.php?t=109607
viewtopic.php?t=124574
-craig

"You can never have too many knives" -- Logan Nine Fingers
DSShishya
Premium Member
Posts: 37
Joined: Tue Oct 27, 2009 9:43 pm

Post by DSShishya »

Thanks Craig!! Those links helped a lot and gave me a much better idea of the IPC stage.

As per Ernie's suggestion, there is a chance of the IPC stage changing the logic and hence giving the wrong output (lookup scenario). But one question comes to mind: why does this not apply to a lookup in a parallel job? After all, the IPC stage only implements parallelism in the server job, right?
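One way to picture how buffering could change lookup logic is the Python sketch below. It assumes the scenario is a job that looks up and writes to the same reference table in a single stream. That is an assumption here, since the linked posts are not quoted in this thread. The names `classify` and `write_lag` are hypothetical, and `write_lag` simply models how many rows are still in a buffer and not yet written when the next lookup happens.

```python
def classify(keys, write_lag):
    """Tag each key 'new' or 'seen' using a table updated write_lag rows late."""
    table = set()      # stands in for the lookup reference (e.g. a Hash file)
    in_flight = []     # rows sitting in a buffer, not yet written
    result = []
    for key in keys:
        # Lookup sees only what has actually been written so far.
        result.append("seen" if key in table else "new")
        in_flight.append(key)
        # Only rows beyond the lag have reached the table.
        while len(in_flight) > write_lag:
            table.add(in_flight.pop(0))
    return result


keys = ["A", "B", "A", "C", "B"]
row_by_row = classify(keys, write_lag=0)   # each row written before next lookup
buffered = classify(keys, write_lag=2)     # two rows always still in the buffer
```

With no lag the second "A" is tagged "seen". With a lag of two rows it is tagged "new", because the first "A" has not been written yet when the lookup runs. The same job design gives different output depending only on buffering.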
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

PX is a completely different beast and built from the ground up for parallelism. Not that you can't still get into trouble with PX and partitioning but it's really an "apples to oranges" comparison to the kind of parallel processing you can make a Server job do. IMHO.

Perhaps Ernie or Ray will stop by and throw their two cents on the table.
-craig