analyze.shm
Just as a test, try to split the job into two.
You can write to a text file from the Link Collector stage in Job 1.
Use the text file as the source and process the rest of the stages in Job 2.
By doing this, you can identify the area where the problem originates. I believe there is more to this than just the way the hashed file is organized.
Attu,
What is your source? Database/File?
What is the size of the hashed file in question?
You would want to resize the hashed file only if you suspect you are exceeding the 2 GB size limit.
Also, I would not preload to memory if the hashed file is very large.
If your source is not a file, you can try what srinagesh is suggesting to help identify the bottleneck.
Narasimha Kade
Finding answers is simple, all you need to do is come up with the correct questions.
[quote="srinagesh"]Just as a test, try to split the job into two.
You can write to a text file, from Link collector stage in Job 1.
Use the text file as source and process rest of the stages in Job 2.[/quote]
Yes, I split the job into two parts. The first part, which had two hashed files doing lookups, was very fast, but the second part again slowed down. The performance of the second part is very poor.
Hi Narasimha,
[quote="narasimha"]What is your source? Database/File?[/quote]
A file with 50 million records.
[quote="narasimha"]What is the size of your hashed file in question?[/quote]
Code: Select all
HF1 32661504 bytes
HF2   700416 bytes
HF3   212992 bytes
HF4 10645504 bytes
HF5  1032192 bytes
[quote="narasimha"]You would want to resize the hashed file only if you suspect you are exceeding the 2 GB size limit.[/quote]
Okay.
[quote="narasimha"]Also I would not preload to memory if the size of the hashed file is very large.[/quote]
I tried that, with no success.
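For context, those sizes are nowhere near the 32-bit 2 GB hashed file limit. A quick sanity check of the figures reported above (the names and byte counts are the ones posted, the script itself is just an illustration):

```shell
# Compare each reported hashed file size (bytes) against the 2 GB limit.
awk 'BEGIN { limit = 2147483648 }  # 2 GB in bytes
     { printf "%s: %d bytes (%.2f%% of the 2 GB limit)\n", $1, $2, 100 * $2 / limit }' <<'EOF'
HF1 32661504
HF2 700416
HF3 212992
HF4 10645504
HF5 1032192
EOF
```

Even the largest file, HF1, sits at well under 2% of the limit, so resizing is unlikely to be the answer here.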
I would like to run the same job in a different environment. I have already exported the dsx. How do I move the hashed files to the other environment?
Can I just do a Unix copy of the hashed files, or is there an import/export command for moving hashed files to different servers?
Appreciate your responses.
Thanks
Yes, you will have to copy the hashed files to the new location. There is no import/export utility for this purpose.
From what I see, your hashed files are not very large.
Not sure why the performance is so poor.
I would try using the default options while creating these hashed files and check the performance.
Narasimha Kade
Finding answers is simple, all you need to do is come up with the correct questions.
Depends on whether they are "pathed" hashed files or were created in an account. For the former, yes, you can simply copy them over to the new server using your tool of choice. Make sure you get everything, including the hidden file for a dynamic (Type30) hashed file and the "D_" dictionary file if present.
For the latter, you can still copy them, but you will need to handle the VOC record if the hashed files don't already exist on the new server. For that you'll need to create it manually using SETFILE.
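A sketch of the pathed-file copy, using throw-away directories to stand in for the two servers. All paths and names here are illustrative; across real servers you would substitute scp -rp or rsync -a for the local cp.

```shell
#!/bin/sh
set -e

SRC_DIR=$(mktemp -d)   # stands in for the old server's directory
DST_DIR=$(mktemp -d)   # stands in for the new server's directory

# A dynamic (Type30) hashed file is a directory holding DATA.30,
# OVER.30 and a hidden .Type30 marker. The marker is easy to miss
# with a naive copy, so always copy the whole directory.
mkdir "$SRC_DIR/HF1"
touch "$SRC_DIR/HF1/DATA.30" "$SRC_DIR/HF1/OVER.30" "$SRC_DIR/HF1/.Type30"

# Copy the whole directory (cp -rp locally; scp -rp / rsync -a remotely).
# If a "D_HF1" dictionary file sits alongside, copy that too.
cp -rp "$SRC_DIR/HF1" "$DST_DIR/"

# Confirm the hidden marker survived the copy
ls -a "$DST_DIR/HF1" | grep -q '^\.Type30$' && echo "marker copied"
```

For the account-based case, the VOC pointer on the new server would then be created with SETFILE, per the post above.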
-craig
"You can never have too many knives" -- Logan Nine Fingers
Can you graphically outline the second job?
Try moving the Hashed File 4 lookup into the first job and check the performance. Move one transformation after another from Job 2 into Job 1, and at some point you will notice a deterioration in the performance of Job 1. That is the bottleneck you are looking for.
Okay, I was able to run the job on a different server and it completed successfully. We did not break it into pieces; we ran it as it was. The throughput was around 1200 rows/sec, better than what we had on the original server (12 rows/sec).
It seems our server was overloaded, with too many processes running and consuming a tremendous number of CPU cycles.
I just want to know: what is the best practice for running jobs that have a Link Collector, multiple hashed file lookups, and lots of transformers?
Thanks for the responses.
There's a rule of thumb that suggests a maximum of four hashed file lookups per Transformer stage.
Such a job can usually benefit from inter-process row buffering.
I'm not aware of any "best practices" relating to the Link Collector stage, apart from: don't use it if you don't need to. For example, you do need it if you're writing to a sequential file, but you don't need it if you're inserting new rows into a database table (and the keys are correctly partitioned).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.