19 Hashed file look up....
Hi all,
We have a job where the data stream goes through a transformer, and that transformer uses 19 hashed files for lookups. 18 of those hashed files share the same metadata: 5 columns, all of them part of the hash key. We compare on 3 of the columns and take the other 2 as lookup output.
I think this logic is making my job run slow, so what solution should I apply?
Ans me thanks.
I have recommended to you so many times to first look at the server resources before stating a job is "slow". Can you PLEASE state your version of Unix? Run prstat, topas, top, or glance and watch your job run. If the CPU for the job is at 100%, your job IS NOT SLOW. You just have a lot of logic, and that requires CPU time. Your next steps to improve performance are to tune the hashed files, then incorporate multiple job instances, partition your data, and use multi-processing techniques.
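One way to put numbers behind that resource check is to poll the job process's CPU usage while it runs. A minimal Python sketch, assuming a POSIX `ps` is available and that you have already found the job's process id (`job_pid` below is a placeholder, not something DataStage provides):

```python
import subprocess

def cpu_percent(pid: int) -> float:
    """Return the %CPU that POSIX `ps` reports for one process."""
    out = subprocess.run(
        ["ps", "-o", "pcpu=", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    )
    return float(out.stdout.strip())

# While the job runs, poll its uvsh/phantom process (job_pid is a
# placeholder -- find the pid first, e.g. with `ps -ef | grep phantom`):
# print(cpu_percent(job_pid))
```

If that figure sits at the single-CPU ceiling for your box, the job is CPU-bound and the fix is tuning and parallelism, not "the job is slow".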
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Re: 19 Hashed file look up....
swades wrote: Ans me thanks.
No reason to ask for an answer.
Pretty sure "we" all have had this conversation before. Whatever happened to the resource checks Ken asked you to perform?
(ack, too slow. Waves as the Ken-mobile blows right on by)
-craig
"You can never have too many knives" -- Logan Nine Fingers
Go to your DataStage server unix command line. Type in:
Code: Select all
uname -X
Get the number of cpus. Then type in
Code: Select all
prstat -a
Look at the server utilization. Look at the processes running. Look at the top users listed at the bottom. Each DS process can only use 1/NumCPUs of the total. So 4 cpus means a FULL SPEED JOB will show as 25%.
Start your job running. Watch and see what the uvsh or phantom process achieves while your job is running. If other things are running, your ability to reach 25% will be limited.
Your goal is to get every type of job you write to use a full CPU. When you can reach a full CPU, you'll use partitioned parallel multiple instances to then run more job instances to fully use all CPUs.
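The 1/NumCPUs ceiling above is simple arithmetic, sketched here in Python (a sketch only; it applies to tools like `prstat -a` that report each process as a percentage of the whole server, not of one core):

```python
import os

# A single-threaded DataStage process can occupy at most one CPU,
# so on an N-CPU server it tops out at roughly 100/N percent of
# total server CPU in whole-server views such as `prstat -a`.
ncpus = os.cpu_count()
full_speed_pct = 100.0 / ncpus
print(f"{ncpus} CPUs -> a full-speed job shows ~{full_speed_pct:.0f}%")
```

So on a 4-CPU box, 25% is the best a single job instance can do, and anything well under that means the job is waiting on something.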
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
What about YOUR JOB! Does your job show it's using 25%? If it is, then the job is going as fast as ONE CPU can work.
If it isn't, then something is interfering with your job. That could be a lack of CPU time, but since there's 50% free CPU time, it's something else. It could be a disk issue slowing down memory swapping, reference lookups, or writes to disk.
If your source stream is a data file, there's no significant delay in reading the file; if you're writing to a file, there's usually no significant delay in writing. If your source stream is a database, then your job is waiting on the database to send data. Didn't we just cover this in a previous post with you, about writing data to a file instead of sending it directly into intense transformation?
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Supply and demand again.
Manage your expectations.
Try splitting the 19 lookups across, say, four transformer stages rather than have one as the bottleneck.
Next time you run the job, enable statistics on the Job Run Options dialog for that single transformer stage - learn where it is spending most of its time.
Are the hashed files read-cached?
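The splitting suggestion amounts to round-robin distribution of the lookups across stages. A Python sketch with illustrative names only (the actual change is made in the job design, not in code):

```python
# Distribute 19 lookup references across 4 transformer stages so no
# single transformer carries all of them (names are hypothetical).
lookups = [f"hash_lookup_{i}" for i in range(1, 20)]

def split_into_stages(items, n_stages):
    """Round-robin the items into n_stages groups."""
    return [items[i::n_stages] for i in range(n_stages)]

stages = split_into_stages(lookups, 4)
for i, group in enumerate(stages, 1):
    print(f"Transformer {i}: {len(group)} lookups")
```

Each stage then does roughly 5 lookups instead of 19, and the per-stage statistics will show which group of lookups actually costs the time.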
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.