Sparse Lookup with Teradata Connector stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
shershahkhan
Participant
Posts: 64
Joined: Fri Jan 25, 2008 4:41 am

Sparse Lookup with Teradata Connector stage

Post by shershahkhan »

My Questions are
1) Can Teradata Connector stage be used for sparse lookup? if yes any example would be great.

2) In Server Job can we use Teradata API Stage to do sparse lookup?

3) In Teradata Connector stage what are good practices on the usage of Parallel Synchronization Section.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

1) No. Only DB2 and Oracle are supported for sparse lookup.
[Edit: in version 7.5]

2) Yes. It's probably slow.

3) Experiment to find what settings work best for you. There are too many variables to give a "one size fits all" answer, not least the size and volume of the data you wish to process.
Last edited by ray.wurlod on Fri Aug 15, 2008 7:44 am, edited 1 time in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
shershahkhan
Participant
Posts: 64
Joined: Fri Jan 25, 2008 4:41 am

Post by shershahkhan »

Can we specify the session per Player and requested session etc options in TPT stage or these are just for Teradata Enterprise stage
toshea
Participant
Posts: 79
Joined: Thu Aug 14, 2008 6:46 pm

Post by toshea »

1) Yes, you can do a sparse lookup with the Teradata Connector. Draw a reference link from the Teradata Connector to the Lookup stage. Then open the Teradata Connector stage and there will be a property that allows you to choose between a Normal lookup or a Sparse lookup.

2) Yes, you can use a Teradata API stage to do a sparse lookup on the Server canvas but not on the Parallel canvas.

3) Use the Parallel Synchronization property if you want the Teradata Connector to run in parallel. Otherwise it will run sequential. If you specify a sync table that is unique to that connector instance, you do not need to supply the Sync ID property. If you want to share the same sync table with multiple Teradata Connector stages, each connector must specify a unique Sync ID to distinguish its rows in the sync table from the other connector instances.
shershahkhan
Participant
Posts: 64
Joined: Fri Jan 25, 2008 4:41 am

Post by shershahkhan »

1) Great, i will try to do some test. Can you provide some reference document, because i am not able to find details about Lookup Type in the documentation but it do show this sparse option on the canvas.
toshea
Participant
Posts: 79
Joined: Thu Aug 14, 2008 6:46 pm

Post by toshea »

See the Connectivity Guide for Teradata Databases that comes with the Teradata documentation.
toshea
Participant
Posts: 79
Joined: Thu Aug 14, 2008 6:46 pm

Post by toshea »

Sorry, I meant Information Server documentation, not Teradata documentation.
shershahkhan
Participant
Posts: 64
Joined: Fri Jan 25, 2008 4:41 am

Post by shershahkhan »

The trick is to use ORCHESTRATE Keyword when you want to reference the main stream column name in SQL of Teradata Connector. Eg
Select col1,col2 from TableName Where Col1=ORCHESTRATE.ColName
DS_FocusGroup
Premium Member
Premium Member
Posts: 197
Joined: Sun Jul 15, 2007 11:45 pm
Location: Prague

Post by DS_FocusGroup »

Can anyone tell difference between normal and sparse lookup options that we have in connector stage.
toshea
Participant
Posts: 79
Joined: Thu Aug 14, 2008 6:46 pm

Post by toshea »

In a normal lookup, the lookup data set is populated at the start of the job from the contents of the lookup table. Each main input row into the lookup stage causes a probe into that data set. With a normal lookup, you have the option of using the connector's Bulk access method to populate the lookup data set. A normal lookup puts less load on the lookup database, since a single query is issued at the start of the job. This can be a good option if the lookup table is small enough to populate the data set.

In a sparse lookup, each main input row causes a probe directly to the database table. Every main input row causes a SELECT to be issued to the lookup database. This could be slower since more queries will be issued, but it may be a better option if the lookup table is large. In a sparse lookup, you can only use Immediate access (CLIv2) since Bulk (TPT Export) doesn't support lookups.

The choice of which option to use depends on the size of the lookup table and how much of it you expect to hit. If the lookup table is large and you only expect to hit a small percentage of rows in that table, use a sparse lookup. If the table is small, you may want to use a normal lookup.
Post Reply