Performance

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

clarcombe
Premium Member
Posts: 515
Joined: Wed Jun 08, 2005 9:54 am
Location: Europe

Post by clarcombe »

It would seem that the overhead actually lies with writing to the table. You might want to see if there is a way of performing a bulk load to the table.

Additionally, look at increasing the array size and transaction size on the DB2 stage. These determine how many rows are transferred in each read and how many are written to the table per commit. Try increasing them in multiples of 5000.

Also, you might want to put an InterProcess stage between the two.
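To see why a larger array size helps, here is a minimal sketch (illustrative only; the names `batches` and `array_size` are my own, not DataStage settings). Rows are shipped to the database in chunks, so a bigger chunk means far fewer round trips:

```python
# Hypothetical sketch of the effect of the DB2 stage's array size: rows are
# sent to the database in batches, so fewer network round trips are needed.
def batches(rows, array_size):
    """Yield rows in chunks of array_size, mimicking the stage's array size."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == array_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly short, batch
        yield batch

rows = list(range(12000))
# With array_size=5000 the 12,000 rows go over in 3 batches (5000, 5000, 2000)
# instead of 12,000 single-row trips.
round_trips = sum(1 for _ in batches(rows, 5000))
```

With an array size of 1, every row is its own round trip, which is why bumping it often makes a dramatic difference.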
Colin Larcombe
-------------------

Certified IBM InfoSphere DataStage Developer
Nisusmage
Premium Member
Posts: 103
Joined: Mon May 07, 2007 1:57 am

Post by Nisusmage »

I have something similar.

I put an InterProcess stage between the source and the Link Partitioner, and another between the Link Collector and the target. You should see better performance.

NOTE! However:
1) I've noticed that the InterProcess stage is a little unstable. Keep your row count below 2 million and you should be fine.
2) Check your hardware: do you have more than one processor, and is the job using them all? Otherwise you gain little more than a time-sliced, multithreaded job. More processors means more processing.
~The simplest solutions are always the best~
~Trick is to understand the complexity to implement simplicity~
JoshGeorge
Participant
Posts: 612
Joined: Thu May 03, 2007 4:59 am
Location: Melbourne

Post by JoshGeorge »

Performance of the Link Partitioner with multiple Transformers depends on the number of CPUs on the server. Use the InterProcess (IPC) stage, which provides a communication channel between concurrently running processes.
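The communication channel the IPC stage provides can be sketched with a bounded buffer between two concurrent workers (illustrative only; the real IPC stage is configured in the DataStage Designer, not coded, and the names here are my own):

```python
# Minimal sketch of an IPC-style buffer: the producing and consuming "stages"
# run concurrently instead of one after the other.
import queue
import threading

buf = queue.Queue(maxsize=128)   # bounded buffer, like the IPC stage's buffer
results = []

def producer():
    for i in range(5):
        buf.put(i)               # upstream stage writes rows
    buf.put(None)                # end-of-data marker

def consumer():
    while True:
        row = buf.get()
        if row is None:
            break
        results.append(row * 2)  # downstream stage transforms rows as they arrive

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
```

The bounded buffer is the key design point: the producer never runs unboundedly ahead of the consumer, yet neither has to wait for the other to finish.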
Joshy George
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Plus the fact that 350,000 records really isn't a large enough volume for most job design changes to make a significant difference in runtime.
-craig

"You can never have too many knives" -- Logan Nine Fingers
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

What array size are you using? If it's 1, try increasing it. Sometimes a simple change in array size can make all the difference.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

IPC, Link Partitioner and Link Collector have had some bad press here. If you're worried about that, use a Transformer stage to implement round-robin partitioning (Mod(@INROWNUM,3) for 3 outputs). Enable inter-process row buffering if you have two active stages; otherwise enable in-process row buffering.

But I still suspect the major delays are in the database. Are there many indexes and constraints on the table? Have you tried bulk load?
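The Mod(@INROWNUM,3) round-robin idea can be sketched as follows (a hypothetical illustration; `partition` is my own name, and @INROWNUM is 1-based in DataStage, which this sketch mirrors):

```python
# Round-robin partitioning via Mod(@INROWNUM, 3): row n goes to output
# n mod 3, so consecutive rows cycle evenly across the outputs.
def partition(row_numbers, n_outputs=3):
    """Distribute 1-based row numbers across n_outputs round-robin."""
    outputs = {i: [] for i in range(n_outputs)}
    for rownum in row_numbers:
        outputs[rownum % n_outputs].append(rownum)
    return outputs

# Rows 1..9 split across 3 outputs:
parts = partition(range(1, 10), 3)
```

In the job itself each output link's constraint would test one value of the Mod expression, so every row satisfies exactly one constraint and the three downstream streams stay balanced.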
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
asitagrawal
Premium Member
Posts: 273
Joined: Wed Oct 18, 2006 12:20 pm
Location: Porto

Post by asitagrawal »

I agree with Ray's post as well...
Mod(@INROWNUM,3)
I am implementing similar logic to partition the input data for my process, where I have to load approximately 30 million rows on a daily basis.

It's a really good approach... you should design your own logic to split the data into different sets and process them by running multiple instances of your job.
Share to Learn, and Learn to Share.