Job optimization

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
urshit_1983
Participant
Posts: 73
Joined: Wed Jun 28, 2006 3:27 pm
Location: NJ

Job optimization

Post by urshit_1983 »

Hi ,

I am pulling data from Oracle using transformer to change i/p and load it in to a hash file. Transformer has many IF THEN ELSE condition used. I tried to optimize the job using stage variables and inter process row buffering. But still not satisfied. Plz let me know what else i need to change in Hash file. I am using default type 30 dynamic file.

Thanks
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Welcome aboard. It's absolutely impossible to tell if your Oracle query is slow, the hashed file is improperly sized, or your transformation logic is slow. We also don't know how many rows of data you are processing.

Breakup your job and write your Oracle results just to a text file with no transformation and then you'll see how fast the data will flow from Oracle.

Transform the text file and write to another text file and you'll see how fast your transformation logic can fly.

Then, take the resulting text file and just load it into a hashed file and you'll see how fast DataStage can put your data into a hashed file.

Whichever of these three jobs is the slowest is the one that has the bottleneck. You'll probably see that it's the Oracle piece, but if not, then it's probably the transform piece.

Since your design is basically single threaded, you'll find you have an opportunity to take the middle transformation job and run multiple job instances and use every single cpu on your server. These multi-processing concepts are discussed here so have fun reading.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
urshit_1983
Participant
Posts: 73
Joined: Wed Jun 28, 2006 3:27 pm
Location: NJ

Post by urshit_1983 »

Hey Kenn,

yes i found out its Oracle stage causing prob. What shall I do now? could you guide me out.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I very much doubt that it's the Oracle stage that's causing the problem; it's far more likely that it's how you've constructed the SQL query. Are you using generated SQL, user-defined SQL or a stored procedure to extract your rows? Can you post the SQL so we can suggest how it might be improved?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
urshit_1983
Participant
Posts: 73
Joined: Wed Jun 28, 2006 3:27 pm
Location: NJ

Post by urshit_1983 »

Hi Ray,

I am using generated SQL and user defined as well the where clause and order by clause to sort data before changing in tx.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

The WHERE causes the database to discriminate or search for data, which may or may not use indexes or partition elimination to more readily find the result rows.

The order by causes the entire query to completely execute and gather all result rows before sending anything to the requester, in this case a DS job. At that time, it's simply a matter of DS catching the rows as they traverse the database and network on its way over to the DS process.

Why sort the data? Once it goes into the hashed file it's row order is lost. Are you doing something that requires ordering?
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Post Reply