Performance problem even after tuning...URGENT


deepak_b73
Participant
Posts: 12
Joined: Thu Feb 16, 2006 1:06 am
Location: Bangalore
Contact:

Performance problem even after tuning...URGENT

Post by deepak_b73 »

Hi,

We are facing a performance problem in DataStage, and the problem here is a bit different.
We have done all the required optimization at the job level, yet even after tuning the job still takes an hour to complete. Our production setup is as below.

Server: IBM P Series (model 70)
OS: AIX 5.3
DataStage version: Enterprise Edition 7.5.1

The server had 3 processors assigned, and with this setup the job was taking 1 hour.
I increased the processing capacity of the server by adding more processors, and now it has 5 processors. But the job still takes 1 hour to complete. We checked CPU utilization while the job was executing: 50% of the CPU capacity was free, yet DataStage was not using the available free CPU capacity to improve performance. Why is it not using the free capacity? Is there some setting that needs to be done at the OS level or the DataStage level to enable usage of the full CPU capacity? Please help me.

Regards,
Deepak
Deepak Bhat
Bangalore
sjfearnside
Premium Member
Posts: 278
Joined: Wed Oct 03, 2007 8:45 am

Post by sjfearnside »

Here are two other factors to look at:

1. Memory usage

2. I/O throughput
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

You posted a server job type in the EE forum. Is it really a server job?

With server jobs, you have to design most of the parallelism in yourself.

Mike
deepak_b73
Participant
Posts: 12
Joined: Thu Feb 16, 2006 1:06 am
Location: Bangalore
Contact:

Post by deepak_b73 »

Hi sjfearnside,

Thanks for your response. We checked memory usage; it is under control.
When you say memory usage, do you mean the DS project folder or the data folder?

I don't know how to check I/O throughput. Please let me know how to check it.

Regards,
Deepak
Deepak Bhat
Bangalore
deepak_b73
Participant
Posts: 12
Joined: Thu Feb 16, 2006 1:06 am
Location: Bangalore
Contact:

Post by deepak_b73 »

Mike,

I posted it in this forum since the DataStage version we use is Enterprise Edition. However, we are using the server job type: we created server jobs earlier, when we were using DataStage Server Edition, and we are not using the parallel job type. That is why I posted it here.

Regards,
Deepak
Deepak Bhat
Bangalore
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

Ok. If you want a server job to use more processors, then you have to design it to use more processors.
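To illustrate the general idea outside DataStage: a single sequential process stays on one CPU no matter how many the box has. You only use more processors when the design splits the stream into partitions processed concurrently, which is what an IPC stage, a Link Partitioner/Link Collector pair, or multiple job instances over data ranges give you in a server job. A rough Python sketch of the partition-and-collect pattern, purely for illustration (the row logic is a made-up stand-in):

```python
from multiprocessing import Pool

def transform(row):
    # Stand-in for per-row Transformer work: trim strings, default nulls.
    return tuple("0" if col is None else str(col).strip() for col in row)

def run(rows, degree=1):
    # degree=1 behaves like a plain server job: one process, one CPU,
    # regardless of how many processors the server has.  degree>1 mimics
    # a partitioned design (Link Partitioner/Collector, IPC stage, or
    # multiple job instances): rows are split across processes, so the
    # extra CPUs can actually be used.
    with Pool(processes=degree) as pool:
        return pool.map(transform, rows, chunksize=1000)

if __name__ == "__main__":
    rows = [(" a ", None, 42)] * 10_000
    print(run(rows, degree=4)[0])  # ('a', '0', '42')
```

Adding CPUs only changes `degree`'s ceiling; it never changes a `degree=1` design.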

Mike
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

deepak_b73 wrote:I posted it in this forum since the DataStage version we use is Enterprise Edition. However, we are using the server job type: we created server jobs earlier, when we were using DataStage Server Edition, and we are not using the parallel job type. That is why I posted it here.
Well... for the record, there's one forum here for Parallel jobs and one for Server jobs, regardless of the 'Edition' you are using. Granted, renaming them to be more in line with IBM's branding confuses things, but that was the intent here.

Parallel job questions in the EE/PX forum.
Server job questions in the Server forum.
Sequence / batch / generic questions in the General forum.

Just as an FYI.
-craig

"You can never have too many knives" -- Logan Nine Fingers
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Adding more CPUs to a server doesn't magically mean a single job will use more of them, simply that more jobs could run at the same time. And if you want tips on tuning your job design, you'd need to post it first so we have some idea what it is doing.
-craig

"You can never have too many knives" -- Logan Nine Fingers
sjfearnside
Premium Member
Posts: 278
Joined: Wed Oct 03, 2007 8:45 am

Post by sjfearnside »

deepak_b73 wrote:Hi sjfearnside,

Thanks for your response. We checked memory usage; it is under control.
When you say memory usage, do you mean the DS project folder or the data folder?

I don't know how to check I/O throughput. Please let me know how to check it.

Regards,
Deepak
Talk to the systems people who support your platform (AIX, Linux, etc.). They usually have tools at their disposal to monitor I/O throughput. Have them monitor I/O, memory, and CPU usage while the job is executing to get a reading on resource utilization. That may provide a clue to where the problem occurs.
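To make that concrete: on AIX the admins would typically run `vmstat 5` and `iostat 5` (or `nmon`/`topas`, if installed) while the job executes, and watch the idle (`id`) and I/O-wait (`wa`) CPU columns. High `wa` alongside idle CPU suggests the job is disk-bound rather than CPU-bound. A small sketch of reading those columns from a captured vmstat line (the sample line and column layout are assumptions based on AIX vmstat; verify against your own output):

```python
# Hypothetical sample line from `vmstat 5` on AIX; the last four
# columns are CPU us / sy / id / wa (user, system, idle, I/O wait).
sample = "1 0 229367 7332 0 0 0 0 0 0 120 350 80 14 2 79 5"

def cpu_summary(vmstat_line):
    fields = vmstat_line.split()
    us, sy, idle, wait = (int(f) for f in fields[-4:])
    return {"user": us, "system": sy, "idle": idle, "iowait": wait}

stats = cpu_summary(sample)
print(stats)  # {'user': 14, 'system': 2, 'idle': 79, 'iowait': 5}
if stats["iowait"] > stats["user"] + stats["system"]:
    print("Mostly waiting on I/O -- look at the disks, not the CPUs")
```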
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Use stage tracing (from the Tracing tab of the Job Run Options dialog) to capture statistics about the Transformer stage(s). Since you have provided no information about the job design there's not really much advice we can sensibly give.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
deepak_b73
Participant
Posts: 12
Joined: Thu Feb 16, 2006 1:06 am
Location: Bangalore
Contact:

Post by deepak_b73 »

Hi,

Thanks to all of you for your replies. The job design is pretty simple. It has a Sequential File stage, from which data is read into a Transformer. The Transformer has 3 lookups, each with no more than 200 rows of data.
From the Transformer, data is loaded into an Oracle table using the Oracle bulk load stage. The source data is 30 million records. The Transformer just has a Trim on all columns, one NullToZero function on a numeric column, and one simple constraint that filters rows where one column's value is greater than 0. This job takes one hour, and even after increasing the processing capacity, performance is the same.

Regards,
Deepak
Deepak Bhat
Bangalore
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Use stage tracing (from the Tracing tab of the Job Run Options dialog) to capture statistics about the Transformer stage(s). Since you have provided no information about the job design there's not really much advice we can sensibly give. One solution that may be suggested by the collected statistics is to use more than one Transformer stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
martind
Participant
Posts: 12
Joined: Thu Dec 18, 2003 4:41 am
Location: Sydney

Post by martind »

Can you replicate the problem on the development server?

Could you make a copy of the job in your production environment and change the Oracle output into a sequential file output and see how fast it runs?

Do you have any indexes, constraints, triggers on your target table?
"I drink to make other people interesting"

George Jean Nathan
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

deepak_b73 wrote:Hi,

Thanks to all of you for your replies. The job design is pretty simple. It has a Sequential File stage, from which data is read into a Transformer. The Transformer has 3 lookups, each with no more than 200 rows of data.
From the Transformer, data is loaded into an Oracle table using the Oracle bulk load stage. The source data is 30 million records. The Transformer just has a Trim on all columns, one NullToZero function on a numeric column, and one simple constraint that filters rows where one column's value is greater than 0. This job takes one hour, and even after increasing the processing capacity, performance is the same.

Regards,
Deepak
Are those lookups going to the database, or have you downloaded the reference data to hashed files? (I can't remember whether you can control if Server lookups go to the DB every time or cache the contents in internal storage...) You could always try moving them to hashed files, assuming you haven't already.

Does every row have to perform all three lookups, or only certain rows? Put constraints on the lookups if possible, since you are currently doing 90 million lookups.
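To show the arithmetic behind the hashed-file suggestion, sketched outside DataStage (all names here are made up for illustration): with only ~200 reference rows per lookup, loading each reference set into memory once and probing it per source row is vastly cheaper than 90 million round trips to Oracle.

```python
# Hypothetical illustration: cache each small reference set once, then
# probe it in memory for every source row -- the same effect a hashed
# file gives a lookup in a DataStage server job.

def build_cache(reference_rows):
    # One pass over the ~200-row reference set.
    return {key: value for key, value in reference_rows}

def transform(source_rows, caches):
    # 30M source rows x 3 lookups = 90M probes, all in memory.
    for row in source_rows:
        enriched = tuple(cache.get(row[0], "unknown") for cache in caches)
        yield row + enriched

ref = [(i, f"desc{i}") for i in range(200)]
caches = [build_cache(ref)] * 3
out = list(transform([(7, "x"), (999, "y")], caches))
print(out)  # key 999 misses every lookup and falls back to "unknown"
```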
Post Reply