It would seem that the overhead actually lies in writing to the table. You might want to see if there is a way of performing a bulk load instead.
Additionally, check out increasing the array size and transaction size of the DB2 stage. These determine how much data is transferred in each read and how many rows are written to the table per transaction. Try increasing them in multiples of 5000.
Also, you might want to stick an InterProcess stage between the two.
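The array size / transaction size idea can be sketched generically. This is an illustrative Python sketch only, not the DB2 stage's actual mechanics: sqlite3 stands in for DB2, and the table `t` and the two constants are made-up examples. The point is simply to batch the sends and commit less often than once per row.

```python
import sqlite3

# Send rows in batches ("array size") and commit once per group of rows
# ("transaction size") instead of per row. Values are illustrative.
ARRAY_SIZE = 5000        # rows sent per executemany() call
ROWS_PER_COMMIT = 10000  # rows written per transaction

def bulk_insert(conn, rows):
    cur = conn.cursor()
    pending = 0
    for i in range(0, len(rows), ARRAY_SIZE):
        batch = rows[i:i + ARRAY_SIZE]
        cur.executemany("INSERT INTO t(val) VALUES (?)", batch)
        pending += len(batch)
        if pending >= ROWS_PER_COMMIT:
            conn.commit()
            pending = 0
    conn.commit()  # flush the final partial transaction

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(val INTEGER)")
bulk_insert(conn, [(n,) for n in range(25000)])
```

Fewer, larger commits cut per-row round-trip and logging overhead, which is why raising both settings usually helps until memory or lock contention pushes back.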
Performance
I have something similar.
I put an InterProcess stage between the source and the Link Partitioner, and another between the Link Collector and the target. You should get better performance.
NOTE, however:
1) I've noticed that the InterProcess stage is a little unstable. Keep your row counts below 2 million and you should be fine.
2) Check your hardware: do you have more than one processor, and is the job using them all? Otherwise you don't gain much over a time-sliced, multithreaded job. More processors means more parallel processing.
~The simplest solutions are always the best~
~Trick is to understand the complexity to implement simplicity~
Performance of a Link Partitioner with multiple Transformers depends on the number of CPUs on the server. Use the inter-process (IPC) stage, which provides a communication channel between running processes.
Joshy George
IPC, Link Partitioner and Link Collector have had some bad press here. If you're worried about that, use a Transformer stage to implement round-robin partitioning (Mod(@INROWNUM,3) for 3 outputs). Enable inter-process row buffering if you have two active stages; otherwise enable in-process row buffering.
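The Transformer trick is just a modulo on the 1-based row number. Here is an illustrative Python sketch of the same idea (the function name and row values are made up, and @INROWNUM's 1-based counting is preserved with enumerate):

```python
# Round-robin partitioning via a modulo on the row number, analogous to
# routing rows down 3 Transformer output links with Mod(@INROWNUM, 3).

def partition_round_robin(rows, n_partitions=3):
    """Distribute rows across n_partitions output lists in round-robin order."""
    outputs = [[] for _ in range(n_partitions)]
    for rownum, row in enumerate(rows, start=1):  # @INROWNUM is 1-based
        outputs[rownum % n_partitions].append(row)
    return outputs

parts = partition_round_robin(["r1", "r2", "r3", "r4", "r5", "r6"])
# Each output list receives every third row, so the streams stay balanced.
```

Each output link can then feed its own Transformer, giving the parallelism the Link Partitioner would have provided without using that stage.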
But I still suspect the major delays are in the database. Are there many indexes and constraints on the table? Have you tried bulk load?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I agree with Ray's post too...
It's a really good approach... you must design your own logic to split the data into different sets and process them by running multiple instances of your job...
I am implementing similar logic to partition the input data for my process, where I have to load approx. 30 million rows on a daily basis... Mod(@INROWNUM,3)
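With multiple instances of the same job, each instance keeps only its own slice of the input. This is a hypothetical Python sketch of that selection, not the poster's actual job; the function and parameter names are made up, mirroring Mod(@INROWNUM, N) as a filter:

```python
# Each of n_instances job instances processes only the rows whose 1-based
# row number maps to its instance id under the modulo.

def rows_for_instance(rows, instance_id, n_instances=3):
    """Return the rows that instance instance_id (0..n_instances-1) should process."""
    return [row for rownum, row in enumerate(rows, start=1)
            if rownum % n_instances == instance_id]
```

Because every row number maps to exactly one instance id, the instances together cover the full input with no overlap, which is what makes the split safe for a 30-million-row daily load.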
Share to Learn, and Learn to Share.