DataStage 8.0.1 optimization

friend.kak@gmail.com
Participant
Posts: 28
Joined: Sat May 03, 2008 3:57 am
Location: chennai

Post by friend.kak@gmail.com »

Hi,
We have a requirement from the client to fine-tune the jobs that run for a long time in production.

Environment details in brief:

Version : IBM WebSphere DataStage 8.0.1 (SunOS 5.10)
Sources : Teradata, Oracle, flat files
Target : Teradata
Transformation rules : Complex
Nodes : 1 node in production (config file)

From what we have observed, the ETL tool (DataStage) is used only when the business rules or functionality cannot be implemented in Teradata SQL or via BTEQ. Loading is done through DataStage only, with TD MultiLoad and TD Enterprise.

Please post any key pointers that could help us fine-tune these jobs.


Thanks, friend.kak@gmail.com
- Dev
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

When you run your job, is CPU use over 90% for long periods of time, is I/O overloaded, or does the machine swap? Without knowing where your bottleneck is, one cannot recommend tuning.
That said, a 1-node configuration doesn't give PX a chance to take advantage of any parallelism whatsoever.
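
To make those three questions concrete, here is a minimal Python sketch that classifies a job run from monitoring samples. Only the 90% CPU figure comes from the question above; the `Sample` fields, the 30% I/O-wait limit and the "80% of samples" rule for "long periods" are assumptions chosen for illustration, and the numbers themselves would have to come from whatever monitoring tool you use on the server.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One monitoring interval taken while the job runs (hypothetical layout)."""
    cpu_busy_pct: float   # user + system CPU, in percent
    io_wait_pct: float    # assumed measure of time spent waiting on I/O
    pages_swapped: int    # assumed count of pages swapped in/out in the interval


def classify(samples, cpu_limit=90.0, io_limit=30.0, sustained=0.8):
    """Return the likely bottleneck: 'swap', 'cpu', 'io' or 'none'.

    A condition counts only if it holds in at least `sustained` (a fraction)
    of the samples, which stands in for "long periods of time".
    """
    if not samples:
        return "none"
    n = len(samples)

    def share(predicate):
        # Fraction of samples for which the predicate is true.
        return sum(1 for s in samples if predicate(s)) / n

    # Swapping is checked first: a machine that swaps distorts CPU and I/O figures.
    if share(lambda s: s.pages_swapped > 0) >= sustained:
        return "swap"
    if share(lambda s: s.cpu_busy_pct > cpu_limit) >= sustained:
        return "cpu"
    if share(lambda s: s.io_wait_pct > io_limit) >= sustained:
        return "io"
    return "none"
```

A result of `'none'` on a long-running job would suggest the server is sitting idle, which is what one would expect if a single node leaves the remaining resources unused.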
Last edited by ArndW on Tue Jul 07, 2009 11:16 pm, edited 1 time in total.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

A one node DataStage configuration for a Teradata environment is insane. Like feeding an elephant with a teaspoon. I am guessing you have performance problems both in your ETL and in the ELT Teradata steps. You might be using the wrong Teradata load options - the best load method often depends on the incoming volume. You will find it hard to tune those BTEQ scripts. You need to get yourself a *DataStage* expert who has solution architecture knowledge: not a *Teradata* expert, but someone who understands how to balance load across ETL and ELT. Turn your DataStage steps into a parallel architecture with a lot of grunt, rewrite the worst BTEQ steps as DataStage steps, and move the heavy lifting off Teradata and onto a massively parallel DataStage cluster. Review how data is pushed into Teradata - have a look at the blog post from Joshy George, "IBM Information Server / Datastage Enterprise Edition with Teradata".
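
For reference, moving off a single node starts with the parallel configuration file. The Python sketch below prints a configuration with several logical nodes on one server. The host name, directory paths and node count are hypothetical placeholders, not values from this thread, and the right number of nodes depends on the CPUs, memory and disk the server really has.

```python
def build_config(fastname, node_count, disk_dir, scratch_dir):
    """Return the text of a parallel configuration file with `node_count`
    logical nodes that all run on the same host (`fastname`)."""
    blocks = []
    for i in range(1, node_count + 1):
        blocks.append(
            f'  node "node{i}"\n'
            "  {\n"
            f'    fastname "{fastname}"\n'
            '    pools ""\n'
            f'    resource disk "{disk_dir}" {{pools ""}}\n'
            f'    resource scratchdisk "{scratch_dir}" {{pools ""}}\n'
            "  }\n"
        )
    # The whole file is wrapped in one outer pair of braces.
    return "{\n" + "".join(blocks) + "}\n"


if __name__ == "__main__":
    # Hypothetical host and paths: replace with your own before use.
    print(build_config("etlhost", 4, "/data/ds/datasets", "/data/ds/scratch"))
```

A job picks up the file named in the APT_CONFIG_FILE environment variable, so a multi-node file can be tried on one job at a time before it becomes the default.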
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Vincent - you were so much more direct than I - do you think I was being too subtle regarding a 1-node configuration? :lol: Compare my
"1-node configuration doesn't give PX a chance to take advantage"
with your somewhat more forceful
"A one node DataStage configuration for a Teradata environment is insane"
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

I did like "feeding an elephant with a teaspoon"!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.