19 Hashed file look up....

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

swades
Premium Member
Posts: 323
Joined: Mon Dec 04, 2006 11:52 pm

19 Hashed file look up....

Post by swades »

Hi all,
We have a job where the data stream goes into a transformer, and that transformer (Xmer) has 19 hashed file lookups. 18 of those hashed files share the same metadata: 5 columns, all of them hash keys. We compare against 3 of the columns and take the other 2 as the lookup output.
I think this logic is making my job run slow, so what solution should I apply?
Ans me thanks.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

I have recommended so many times to you to first look at the server resources before stating a job is "slow". Can you PLEASE state your version of Unix? Run prstat, topas, top, or glance and watch your job run. If the CPU for the job is at 100%, your job IS NOT SLOW. You just have a lot of logic and that requires CPU time. Your next steps to improve performance is to tune hashed files, and then incorporate multiple job instances and partition your data and use multi-processing techniques.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Re: 19 Hashed file look up....

Post by chulett »

swades wrote:Ans me thanks.
No reason to ask for an answer. :?

Pretty sure "we" all have had this conversation before. What ever happened to the resource checks Ken asked you to perform?

(ack, too slow. Waves as the Ken-mobile blows right on by)
-craig

"You can never have too many knives" -- Logan Nine Fingers
swades
Premium Member
Posts: 323
Joined: Mon Dec 04, 2006 11:52 pm

Post by swades »

kcbland wrote: Can you PLEASE state your version of Unix? Run prstat, topas, top, or glance and watch your job run. If the CPU for the job is at 100%, your job IS NOT SLOW.
My OS is Sun Solaris.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Go to your DataStage server unix command line. Type in:

Code:

uname -X
Get the number of cpus.
Then type in

Code:

prstat -a
Look at the server utilization. Look at the processes running. Look at the top users listed at the bottom. Each DS process can only use 1/NumCPUs of the total CPU, so 4 CPUs means a FULL SPEED JOB will show as 25%.
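That per-process ceiling is just arithmetic, and can be sketched in a couple of lines of shell. NCPUS is hard-coded here as an assumption; on a real Solaris box you would take it from `uname -X` or `psrinfo | wc -l`:

```shell
# Illustrative arithmetic only: a single-threaded DataStage process can use
# at most one CPU, so its ceiling in prstat's CPU column is 100/NCPUS.
# NCPUS=4 is assumed for this sketch.
NCPUS=4
MAXPCT=$(awk -v n="$NCPUS" 'BEGIN { printf "%.1f", 100 / n }')
echo "max CPU% a single job process can show: ${MAXPCT}%"
```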

Start your job running. Watch and see what the uvsh or phantom process achieves while your job is running. If other things are running, your ability to reach 25% will be limited.

Your goal is to get every type of job you write to use a full CPU. When you can reach a full CPU, you'll use partitioned parallel multiple instances to then run more job instances to fully use all CPUs.
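As a sketch of that last step, the `dsjob` command line can start several instances of a multiple-instance job, one per partition. The project name DSPROJ, the job name LoadJob, and the PartitionNumber/PartitionCount parameters below are all assumptions for illustration; the loop only echoes the commands instead of running them:

```shell
# Dry run: print one dsjob invocation per CPU/partition. DSPROJ, LoadJob
# and the partition parameters are hypothetical; the job would need to be
# compiled with "Allow multiple instance" for the .partN invocation ids.
NCPUS=4
i=1
while [ "$i" -le "$NCPUS" ]; do
  cmd="dsjob -run -param PartitionNumber=$i -param PartitionCount=$NCPUS DSPROJ LoadJob.part$i"
  echo "$cmd"   # drop the echo to actually start the instances
  i=$((i + 1))
done
```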
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
swades
Premium Member
Posts: 323
Joined: Mon Dec 04, 2006 11:52 pm

Post by swades »

Thanks Ken,
I analyzed my job. There are 4 CPUs, and the total utilization across all four never reaches 100%; it only gets up to about 50% of the total.
swades
Premium Member
Posts: 323
Joined: Mon Dec 04, 2006 11:52 pm

Post by swades »

So what am I supposed to do to utilize 100% of the CPUs?
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

What about YOUR JOB! Does your job show it's using 25%? If it is, then the job is going as fast as ONE CPU can work.

If it isn't, then something is interfering with your job. That could be a lack of CPU time, but since 50% of the CPU is free, it's something else. It could be a disk issue slowing memory swapping, reference lookups, or writes to disk.
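One quick way to check the memory-swapping possibility on Solaris is `vmstat`: a sustained non-zero sr (page scan rate) column means the box is paging. On a live system you would run `vmstat 5` while the job is active; the sample line below is invented for illustration:

```shell
# Parse the sr (page scan rate) column -- field 12 in Solaris vmstat
# output. SAMPLE is a made-up line standing in for real `vmstat 5` output.
SAMPLE=" 0 0 0 842424 124160 3 12 0 0 0 0 37 0 0 0 0 411 223 180 12 5 83"
SR=$(echo "$SAMPLE" | awk '{ print $12 }')
if [ "$SR" -gt 0 ]; then
  echo "page scanner active (sr=$SR): memory pressure is a likely culprit"
else
  echo "sr is zero: memory is probably not the bottleneck"
fi
```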

If your source stream is a data file, there's no significant delay in reading the file. If writing to a file, there's usually no significant delay in writing. If your source stream is a database, then your job is waiting on the database to send data. Didn't we just cover this in a previous post with you about writing data to a file instead of sending it directly into intense transformation?
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Supply and demand again.

Manage your expectations.

Try splitting the 19 lookups across, say, four transformer stages rather than have one as the bottleneck.

Next time you run the job, enable statistics on the Job Run Options dialog for that single transformer stage - learn where it is spending most of its time.

Are the hashed files read-cached?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.