When I have a lookup stage with reference link direct to Oracle stage what exactly happens in the background?
Does this simply create a temporay lookup fileset which is then loaded into memory in "entire" mode on all nodes? or Does it create partitioned lookup filesets? Or does it not really create lookup filesets at all?
If a lookup fileset is only required for one job (MainJob) and will be recreated just before the each run of MainJob, will I get the same performance by having a separate job that loads the lookup fileset and then MainJob uses that lookup fileset compared to the easyier implementation of having the lookup stage connected directly to the Oracle stage?
Lookup direct from Oracle database
Moderators: chulett, rschirm, roy
Lookup direct from Oracle database
Regards,
Nick.
Nick.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
i guess i was talking about the virtual dataset.
Forgetting the overhead in creating this virtual dataset or the overhead creating the lookup fileset, once you had them is there a large performance differnece between performing lookups on the virtual dataset rather than the lookup fileset?
1) With small amounts of lookup data < 100,000 rows
2) With large volumes in the lookup data > 10,000,000
Is the physical implementation of the oracle stage/virtual dataset and the lookup stage very different from the lookup fileset and the lookup stage?
Forgetting the overhead in creating this virtual dataset or the overhead creating the lookup fileset, once you had them is there a large performance differnece between performing lookups on the virtual dataset rather than the lookup fileset?
1) With small amounts of lookup data < 100,000 rows
2) With large volumes in the lookup data > 10,000,000
Is the physical implementation of the oracle stage/virtual dataset and the lookup stage very different from the lookup fileset and the lookup stage?
Regards,
Nick.
Nick.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
It's more to do with the number of distinct values that are likely to be looked up. If this is "small" (they have never quantified what this means), then a sparse lookup is likely to be beneficial.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.