Multiple lookups , same file

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
datastagepraba
Charter Member
Charter Member
Posts: 3
Joined: Sat May 10, 2003 11:08 pm
Location: INDIA

Multiple lookups , same file

Post by datastagepraba »

In my project we have one big classifiction table where we store all the lookups. Is it possible to export it to a sequential file and use it as the only lookup(for all the classification codes) in all the jobs? Each record has to access the file number of times to lookup and get the corresponding codes. Or is it advisable to create multiple lookup files( one for gender , one for role etc.,) and use these as inputs to the lookup stage?
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

If you have the information in the same location (i.e. db, connection etc) as that of your source data, you can use SQL to combine them.

Alternatively, if the size of separate lookup (obtained as breakdown of classification types) are much smaller to be loaded in memory and accessed from there from thereon, you can download them and use as lookups.

But it is better to avoid a large lookup.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

I would load it into a single lookup fileset and place it on all partitions. I assume when you use it you have two key fields, one being the lookup id and the other being the lookup code. A single large lookup file is going to be easier to manage then lots of small ones.

In your lookup stage you should be able to type a code into the lookup id derivation field and link the lookup code field to the input value.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Did some additional testing and discovered my recommendation wouldn't work! While server job transformers let you type in a constant into a lookup derivation field the parallel lookup stage does not. So if you have one extended lookup file your problem is going to be identifying which type of lookup code you want to get. Most methods of passing the lookup type to the lookup stage are quite messy, such as adding a new input data field with a constant in it, or concatenating the lookup type with the lookup code.

I think given the restrictions of the lookup stage you might be better off going for lots of different lookup files.
nkreddy
Premium Member
Premium Member
Posts: 23
Joined: Mon Jun 21, 2004 7:12 am
Location: New York

Post by nkreddy »

Is there a possibility to add that lookup column in the transformer (before the lookup stage) and then do the lookup? We can add all the key columns from all the database lookups in addition to the columns present in the Transformer stage...

For eg:

source -----> Transformer -------> LookUp Stage ------->DataSet or SeqFile

|
|
|
LookUp FileSet
or
Database LookUp

Thanks,
NK

p.s: The LookUp FileSet or Database lookup should be linked to the LookUp Stage....I could never get the drawing straightened out...
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Yes, that's one of the messy workarounds! I would use the Modify stage instead of the Transformer using a new column specifications for each lookup field that are hard coded to the lookup id (PRODCODE, DEPTCODE etc). Modify may be a bit faster then a transformer but a transformer tends to be easier to use.
nkreddy
Premium Member
Premium Member
Posts: 23
Joined: Mon Jun 21, 2004 7:12 am
Location: New York

Post by nkreddy »

Vincent,

Would you please clarify what you meant: I wanted to use Modify Stage rather than the Transformer...just to test it.

Let us say, I have 10 columns coming from the source and none of them can be used as a lookup columns.

In that situation, I have to add the lookup column from the database lookup (say APPL_CODE -- VarChar50) to the columns that are going out of the Modify Stage.

So, I would have 11 columns coming into the LookUp Stage now...Is that right?


In the specification part of the Modify Stage, I would add

APPL_CODE:VarChar50 = APPL_CODE
(new_name = old_name)

and then enable the column propagation..


Regards,
NK
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

In the Modify stage I would set the new field, APPL_CODE, to a constant:
APPL_CODE:VarChar50 = "PRODCODE"

This will create a new field called APPL_CODE.
nkreddy
Premium Member
Premium Member
Posts: 23
Joined: Mon Jun 21, 2004 7:12 am
Location: New York

Post by nkreddy »

Vincent,

I tried and tried...but couldn't solve this problem. It works with the transformer..

The error messages are as follows:

Error when checking operator: Could not find input field "APPL_CODE"
When checking operator: Modify operator keeps field "APPL_CODE" not present in output interface

Specification: KEEP 'Column1', 'Column2'
Specification: APPL_CODE:VarChar50 = #SRC_APPL_N#

#SRC_APPL_N# is also in the lookup table....

Am I missing something here??

Thanks,
NK
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Is APPL_CODE listed on the output columns for your Modify stage? Check on the output tab.
Post Reply