Hi,
I want to do a lookup against a table with 50 million records, with the source flat file having 45 million records.
First I implemented it through a hashed file, the hashed file created from the table.
It was very slow.
Now I have implemented a direct Oracle stage lookup. It is based on two keys; the keys are B-tree indexed.
Here also the performance is very poor; the lookup speed is 280 rows per second.
Please let me know your suggestions to handle this scenario.
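To see why 280 rows/sec is a problem, a quick back-of-envelope check (using the figures from the post above) shows the total runtime this implies:

```python
# Back-of-envelope check: at the observed lookup speed, how long would
# the full source file take? Figures are taken from the post above.
source_rows = 45_000_000   # rows in the source flat file
rows_per_sec = 280         # observed Oracle stage lookup throughput

hours = source_rows / rows_per_sec / 3600
print(f"{hours:.1f} hours")  # about 44.6 hours
```

Nearly two days of runtime makes the per-row database lookup a non-starter for a batch window, regardless of indexing.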
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
Is the hashed file created/populated in the same job or in a different job? If it's created in the same job then rows/sec is meaningless because it includes the wait time while the hashed file is populated. Populate the hashed file in a separate job and be prepared to be amazed.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
The job logic is as given below.
Case 1:
Job 1 - Create a hashed file from a table of 60 million records (fetching around 45 million records based on a condition).
Job 2 - Use the hashed file created in Job 1 for the lookup.
Job 1 is taking a long time to create the hashed file, running at 400 rows/sec.
So I tried the option below:
without a hashed file, do a direct lookup on the indexed columns in the table.
Please suggest...
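The trade-off between the two cases can be illustrated in plain Python (this is an analogy, not DataStage code, and the keys and values are made up): the hashed file approach pays a one-time build cost, after which every probe is a constant-time in-memory operation, whereas the direct Oracle lookup pays a round trip per source row.

```python
# Illustration (not DataStage code): why a pre-built hashed lookup beats a
# per-row database query. The reference dict is built once - the analogue of
# Job 1 populating the hashed file - and each probe is then an O(1)
# in-memory lookup instead of a network round trip per source row.

# Built once, in the separate "job" (hypothetical two-part keys and values)
reference = {("K1", "A"): "ref-1", ("K2", "B"): "ref-2"}

source_rows = [("K1", "A"), ("K3", "C"), ("K2", "B")]

for key in source_rows:
    match = reference.get(key)   # constant-time probe, no round trip
    if match is not None:
        print(key, "->", match)
    # unmatched rows fall through, as on an unmatched lookup link
```

The build phase is the expensive part, which is why the replies below focus on speeding up the write to the hashed file rather than abandoning it.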
Here are the three main ideas for getting the best out of write performance with hashed files.
- Critically examine which rows and columns are loaded into the hashed file, and eliminate any that are not required.
- Pre-size the hashed file using the MINIMUM.MODULUS setting (get an approximate value from the current size of the hashed file divided by the group size, 2 KB).
- If the hashed file is under 999 MB, consider maximizing your write cache buffer size and employing cached writes.
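The pre-sizing estimate in the second point is simple arithmetic; a sketch, with an assumed file size since the thread does not give one:

```python
# Sketch of the MINIMUM.MODULUS estimate described above:
# approximate modulus = current hashed file size / group size.
file_size_bytes = 8 * 1024**3   # e.g. an 8 GB hashed file (assumed figure)
group_size_bytes = 2048         # 2 KB group size, per the post

minimum_modulus = file_size_bytes // group_size_bytes
print(minimum_modulus)  # 4194304
```

Setting MINIMUM.MODULUS near this value up front avoids the repeated dynamic-file splits that otherwise occur while 45 million rows are being written.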
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.