Could you help me with the following issue in a parallel job?

I have:
File A contains 1 million records.
File B contains 1,000 records.
Oracle table C contains 1,000 records.

I want to select some columns from A, B and C based on the condition A.id = B.id and B.id = C.id.

1. Could you suggest the best way of choosing lookups?
2. Are there any performance bottlenecks when we choose heterogeneous lookups (one in a flat file and one in Oracle)?
3. Is there any advantage in choosing a Lookup File Set as the lookup(s)?
4. There is some chance that new records will be added to the Oracle table;
how can I make those changes visible in the lookup automatically?
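Conceptually, what the Lookup stage does with the condition A.id = B.id and B.id = C.id can be sketched in plain Python: load the two small reference inputs into hash tables keyed by id, then stream the large input through them. This is only an illustrative sketch; the field names ("id", "val", etc.) are assumptions, not from the job described above.

```python
# Illustrative sketch of a two-reference in-memory lookup.
# Field names are hypothetical; only the join logic mirrors the question.

def lookup_join(stream_a, ref_b, ref_c):
    """Yield rows of A enriched with matching B and C rows on id."""
    # Build hash tables for the small reference inputs (B and C are
    # 1,000 rows each) -- this mirrors loading lookups into memory.
    b_by_id = {row["id"]: row for row in ref_b}
    c_by_id = {row["id"]: row for row in ref_c}

    for row in stream_a:  # A is the large (1 million row) stream input
        b = b_by_id.get(row["id"])
        c = c_by_id.get(row["id"])
        if b is not None and c is not None:  # A.id = B.id and B.id = C.id
            yield {**row, "b_val": b["val"], "c_val": c["val"]}

a = [{"id": 1, "a_val": "x"}, {"id": 2, "a_val": "y"}]
b = [{"id": 1, "val": "b1"}]
c = [{"id": 1, "val": "c1"}]
print(list(lookup_join(a, b, c)))  # only id 1 matches both references
```

The large file is never held in memory, which is why only the small inputs should be the reference links.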
Lookups in parallel jobs
Hi,
You can use File B and the Oracle stage as reference links for the lookup. The lower-volume data should be chosen as the reference data, because it is loaded into memory during the lookup.
Yes, you can choose a Lookup File Set for File B, which can improve performance. Since the Oracle table changes slowly, if a Lookup File Set is used for it, it must be refreshed whenever the data changes; otherwise the Oracle table can be used directly as the reference.
-Kumar
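The Lookup File Set idea above amounts to preparing the keyed reference data once, persisting it, and reloading it in later runs instead of rebuilding it each time. A rough sketch of that trade-off, with a hypothetical file name and record shape:

```python
# Illustrative sketch of the lookup-fileset idea: persist the prepared
# reference once, reload it cheaply later, and rebuild it whenever the
# source data changes. Names and record shapes are assumptions.
import os
import pickle
import tempfile

def save_lookup(rows, path):
    """Build the id-keyed lookup once and persist it (like a lookup fileset)."""
    with open(path, "wb") as f:
        pickle.dump({row["id"]: row for row in rows}, f)

def load_lookup(path):
    """Reload the prepared lookup without rebuilding it from the source."""
    with open(path, "rb") as f:
        return pickle.load(f)

path = os.path.join(tempfile.gettempdir(), "file_b_lookup.pkl")
save_lookup([{"id": 1, "val": "b1"}], path)
print(load_lookup(path)[1]["val"])  # b1
```

The catch is exactly the one noted above: the persisted copy goes stale the moment the source changes, so a slowly changing Oracle table needs a refresh step.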
Anything under 10,000 lookup reference rows is quite small and will fit into memory, so there is no need to go to the effort of preloading lookup filesets or datasets; just use your 1,000-row sources as straight lookup sources. The Lookup stage should be faster than the Join or Merge options.
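The reason the Lookup stage wins here can be sketched by contrast: a Join stage conceptually sort-merges both inputs, while a Lookup only hashes the small reference side. With a 1,000-row reference, the hash build is trivial. A minimal, illustrative sort-merge join (unique keys assumed) for comparison:

```python
# Illustrative sketch: the work a sort-merge join does that a hashed
# lookup avoids. Assumes unique join keys; names are hypothetical.

def sort_merge_join(left, right, key="id"):
    """Minimal inner sort-merge join on a unique key."""
    left = sorted(left, key=lambda r: r[key])    # both inputs must be sorted,
    right = sorted(right, key=lambda r: r[key])  # including the large one
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk == rk:
            out.append({**left[i], **right[j]})
            i += 1
            j += 1
        elif lk < rk:
            i += 1
        else:
            j += 1
    return out

print(sort_merge_join([{"id": 2, "a": "x"}, {"id": 1, "a": "y"}],
                      [{"id": 2, "b": "z"}]))
```

Sorting the 1-million-row input is the cost a straight lookup avoids entirely.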
It is very tricky to modify your Oracle lookup while you are using it. With a normal lookup, the entire 1,000 rows are loaded into memory at job startup, so any changes to the Oracle table during the run will be ignored. Even with sparse lookups it's tricky, due to the speed of the nodes at adding rows.
In Hawk there is an updatable in-memory lookup that will help you, but in the current release it's just too difficult. You are better off putting all your Oracle lookup inserts into a dataset (via a Remove Duplicates stage) and then loading it into the Oracle table after your job has finished.
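The dataset-plus-Remove-Duplicates pattern above can be sketched as: collect candidate inserts during the run, keep one row per new key (skipping keys already in the table snapshot), and apply the survivors after the job finishes. A hedged sketch, with hypothetical record shapes:

```python
# Illustrative sketch of the suggested pattern: de-duplicate captured
# inserts against a snapshot of existing keys, then load them post-job.
# The record shape and key name are assumptions.

def dedupe_inserts(candidate_rows, existing_ids):
    """Remove-duplicates step: keep one row per new id, skipping ids
    already present in the table snapshot."""
    seen = set(existing_ids)
    out = []
    for row in candidate_rows:
        if row["id"] not in seen:
            seen.add(row["id"])
            out.append(row)
    return out

existing = {1, 2}
candidates = [{"id": 2}, {"id": 3}, {"id": 3}, {"id": 4}]
print(dedupe_inserts(candidates, existing))  # only the new unique ids survive
```

The de-duplicated rows would then be bulk-loaded into the Oracle table in a separate step after the parallel job completes.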
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn