Performance issue with Hashfiles

pradkumar
Charter Member
Posts: 393
Joined: Wed Oct 18, 2006 1:09 pm

Post by pradkumar »

Hi,

I migrated the project from version 7.5 on an HP Unix machine (2 CPUs) to version 8.0.1 on an AIX machine (1 CPU) and moved all the files to the new server. I didn't copy the hashed files from the old server to the new server. Instead, I ran all the jobs that create the hashed files, so they were created in the project directory on the new server.
When I run the remaining jobs, performance is very slow compared to the old server. For example, some jobs that finish in 50 minutes on version 7.5 take more than 3 hours on version 8. The row rate is 60 rows/sec on 7.5 but only 5 rows/sec on version 8.
I checked the CPU utilization and it is almost 90 to 95% utilized. Is this due to the number of CPUs? How can I increase the performance of the jobs (number of rows from hash file to transformer)?

Any inputs would be really appreciated.

Thanks in advance
tcj
Premium Member
Posts: 98
Joined: Tue Sep 07, 2004 6:57 pm
Location: QLD, Australia

Post by tcj »

Are they dynamic or static hashed files?

How many rows are being created in the hashed files?

I would guess that the hashed files on the old 7.5 server are dynamic hashed files that have grown over time. The new hashed files on version 8 were probably created from scratch with the default settings. A dynamic hashed file created that way starts splitting groups as it grows, which can cause major overhead.
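
To make the splitting point concrete, here is a rough Python sketch of how one might estimate a minimum modulus to pre-size a dynamic hashed file so it does not have to split its way up from the default. This is only a back-of-the-envelope estimate, not an official sizing method. The group sizes (2048 and 4096 bytes) and the 80% split load are assumed defaults, the function names are made up for illustration, and per-record overhead is ignored, so treat the result as a starting point.

```python
# Assumed bytes per group for group size 1 and 2 (check your own installation).
GROUP_BYTES = {1: 2048, 2: 4096}


def estimate_minimum_modulus(rows, avg_record_bytes, group_size=1, split_load_pct=80):
    """Estimate how many groups hold the data without exceeding the split load.

    Hypothetical helper: ignores record headers and key overhead.
    """
    if rows < 0 or avg_record_bytes <= 0:
        raise ValueError("rows must be >= 0 and avg_record_bytes must be > 0")
    if not 0 < split_load_pct <= 100:
        raise ValueError("split_load_pct must be in the range 1..100")

    data_bytes = rows * avg_record_bytes
    # Usable bytes per group, scaled by 100 to stay in integer arithmetic.
    usable_x100 = GROUP_BYTES[group_size] * split_load_pct
    # Integer ceiling division: -(-a // b).
    groups = -(-(data_bytes * 100) // usable_x100)
    return max(1, groups)


def splits_avoided(target_modulus, initial_modulus=1):
    """A file that grows one group per split needs roughly this many splits."""
    return max(0, target_modulus - initial_modulus)


if __name__ == "__main__":
    # Example figures only: one million rows of about 100 bytes each.
    modulus = estimate_minimum_modulus(1_000_000, 100)
    print("estimated minimum modulus:", modulus)
    print("splits avoided versus starting at 1:", splits_avoided(modulus))
```

If the estimate is in the right ballpark, creating the hashed file with that minimum modulus means the load job writes into groups that already exist instead of paying for a split every time the file passes its split load.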
tcj
Premium Member
Posts: 98
Joined: Tue Sep 07, 2004 6:57 pm
Location: QLD, Australia

Post by tcj »

Here is an interesting white paper on the subject. I found the link in another post.

http://www.openqm.org/downloads/dynamic_files.pdf

This is the post I found it in. Take note of chulett's post near the bottom.

viewtopic.php?t=109212&highlight=modulus
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Re: Performance issue with Hashfiles

Post by ray.wurlod »

pradkumar wrote: How can I increase the performance of the jobs (number of rows from hash file to transformer)?
The only way that you can increase the number of rows is by increasing the number of rows (that is, by reading more rows).

The execution environment for version 8 is somewhat different to the execution environment for earlier versions, not least because of the fact that things are running in the Information Server domain. Doubtless there are overheads from this, particularly at the metadata management level.

A cynic might observe that they've put in some "slowdowns" for server jobs to encourage everyone to move to parallel jobs. Whether or not this is the case, IBM has been warning about the overheads of executing within the Information Server environment since version 8 was in beta testing. You might like to ask that question of your official support provider.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.