Sparse Lookup with Teradata Connector stage
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 64
- Joined: Fri Jan 25, 2008 4:41 am
Sparse Lookup with Teradata Connector stage
My Questions are
1) Can Teradata Connector stage be used for sparse lookup? if yes any example would be great.
2) In Server Job can we use Teradata API Stage to do sparse lookup?
3) In Teradata Connector stage what are good practices on the usage of Parallel Synchronization Section.
1) Can Teradata Connector stage be used for sparse lookup? if yes any example would be great.
2) In Server Job can we use Teradata API Stage to do sparse lookup?
3) In Teradata Connector stage what are good practices on the usage of Parallel Synchronization Section.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
1) No. Only DB2 and Oracle are supported for sparse lookup.
[Edit: in version 7.5]
2) Yes. It's probably slow.
3) Experiment to find what settings work best for you. There are too many variables to give a "one size fits all" answer, not least the size and volume of the data you wish to process.
[Edit: in version 7.5]
2) Yes. It's probably slow.
3) Experiment to find what settings work best for you. There are too many variables to give a "one size fits all" answer, not least the size and volume of the data you wish to process.
Last edited by ray.wurlod on Fri Aug 15, 2008 7:44 am, edited 1 time in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 64
- Joined: Fri Jan 25, 2008 4:41 am
1) Yes, you can do a sparse lookup with the Teradata Connector. Draw a reference link from the Teradata Connector to the Lookup stage. Then open the Teradata Connector stage and there will be a property that allows you to choose between a Normal lookup or a Sparse lookup.
2) Yes, you can use a Teradata API stage to do a sparse lookup on the Server canvas but not on the Parallel canvas.
3) Use the Parallel Synchronization property if you want the Teradata Connector to run in parallel. Otherwise it will run sequential. If you specify a sync table that is unique to that connector instance, you do not need to supply the Sync ID property. If you want to share the same sync table with multiple Teradata Connector stages, each connector must specify a unique Sync ID to distinguish its rows in the sync table from the other connector instances.
2) Yes, you can use a Teradata API stage to do a sparse lookup on the Server canvas but not on the Parallel canvas.
3) Use the Parallel Synchronization property if you want the Teradata Connector to run in parallel. Otherwise it will run sequential. If you specify a sync table that is unique to that connector instance, you do not need to supply the Sync ID property. If you want to share the same sync table with multiple Teradata Connector stages, each connector must specify a unique Sync ID to distinguish its rows in the sync table from the other connector instances.
-
- Participant
- Posts: 64
- Joined: Fri Jan 25, 2008 4:41 am
-
- Participant
- Posts: 64
- Joined: Fri Jan 25, 2008 4:41 am
-
- Premium Member
- Posts: 197
- Joined: Sun Jul 15, 2007 11:45 pm
- Location: Prague
In a normal lookup, the lookup data set is populated at the start of the job from the contents of the lookup table. Each main input row into the lookup stage causes a probe into that data set. With a normal lookup, you have the option of using the connector's Bulk access method to populate the lookup data set. A normal lookup puts less load on the lookup database, since a single query is issued at the start of the job. This can be a good option if the lookup table is small enough to populate the data set.
In a sparse lookup, each main input row causes a probe directly to the database table. Every main input row causes a SELECT to be issued to the lookup database. This could be slower since more queries will be issued, but it may be a better option if the lookup table is large. In a sparse lookup, you can only use Immediate access (CLIv2) since Bulk (TPT Export) doesn't support lookups.
The choice of which option to use depends on the size of the lookup table and how much of it you expect to hit. If the lookup table is large and you only expect to hit a small percentage of rows in that table, use a sparse lookup. If the table is small, you may want to use a normal lookup.
In a sparse lookup, each main input row causes a probe directly to the database table. Every main input row causes a SELECT to be issued to the lookup database. This could be slower since more queries will be issued, but it may be a better option if the lookup table is large. In a sparse lookup, you can only use Immediate access (CLIv2) since Bulk (TPT Export) doesn't support lookups.
The choice of which option to use depends on the size of the lookup table and how much of it you expect to hit. If the lookup table is large and you only expect to hit a small percentage of rows in that table, use a sparse lookup. If the table is small, you may want to use a normal lookup.