Hi,
We need to look up tables from Oracle, there are two methods -
1. Either we should use direct database table to look up and fetch
2. Load the look up data set files from table and Look up on these files to fetch.
Which one is better?
Look up file set Vs Actual Database lookup
Moderators: chulett, rschirm, roy
Look up file set Vs Actual Database lookup
Thanks,
Yuva.
Yuva.
-
- Participant
- Posts: 135
- Joined: Tue Aug 14, 2007 4:27 am
- Location: Mumbai
Its based on data volume of input VS reference and the your system configuation.
If input volume is significantly smaller then reference table then go far sparce lookup. ( one SQL query for one record )
if both input are same then go and still the volume is not huge ( based on ur system configuration and apt used) go for direct table lookup.
If both Input and reference are coming from TABLE then you can go for SQL join as well.
Else use lookup file set or JOIN stage ...
You Decide
![Cool 8)](./images/smilies/icon_cool.gif)
If input volume is significantly smaller then reference table then go far sparce lookup. ( one SQL query for one record )
if both input are same then go and still the volume is not huge ( based on ur system configuration and apt used) go for direct table lookup.
If both Input and reference are coming from TABLE then you can go for SQL join as well.
Else use lookup file set or JOIN stage ...
You Decide
![Cool 8)](./images/smilies/icon_cool.gif)
![Cool 8)](./images/smilies/icon_cool.gif)
![Cool 8)](./images/smilies/icon_cool.gif)
Thanks
Swapnil
"Whenever you find whole world against you just turn around and Lead the world"
Swapnil
"Whenever you find whole world against you just turn around and Lead the world"
If your lookup data changes once a month, then it makes sense to create a lookup fileset and use that on each run. If the reference data changes daily or each run of the job you can save the resources required to build the lookup and do it directly. This is just one of several choices you can make.
The goals of minimizing runtime as well as minimizing resource use are usually diametrically opposed. That is similar to expect MPG use on a car to get better with faster speeds - it doesn't work that way.
Mimizing resource use per row processed is a possible goal, but won't necessarily result in maximizing throughput.
The goals of minimizing runtime as well as minimizing resource use are usually diametrically opposed. That is similar to expect MPG use on a car to get better with faster speeds - it doesn't work that way.
Mimizing resource use per row processed is a possible goal, but won't necessarily result in maximizing throughput.
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
My recollection, when I had a look at the files getting created when the job runs, is that the job runs the same way using both methods but the job startup will be different. If you already have a lookup fileset the job will start processing straight away. If you don't have a lookup fileset and the lookup data cannot all fit into RAM memory the job will create the equivalent of lookup filesets for you in the node scratch space. So for very large lookups either way you get a lookup fileset - it just depends on whether you want to create it every time you run the job or create it less often to a schedule.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn