Limiting the number of lookups per stage.

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

subramanya
Participant
Posts: 22
Joined: Fri Oct 15, 2004 11:53 pm
Location: Bangalore, India
Contact:

Post by subramanya »

Hi,
I have 20 lookups which can be done in a single Transformer stage. Will dividing them across 4 Transformers (say, 5 lookups each) increase performance?

Thnx
Subs
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Give it a shot - let us know! :wink:

There are too many variables dependent on your particular system and exactly what you are doing in the job for us to give you a definitive answer.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

You can consider grouping lookups so they can be done over a single connection. This will reduce the amount of I/O and improve performance.
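Outside of DataStage, the single-connection idea can be sketched in plain Python with SQLite (the table and column names here are made up): several per-row lookups against the same database collapse into one joined query, cutting the round trips per row.

```python
# Illustration only, not DataStage code: replacing several per-row
# lookups against the same database with one joined query.
# Table and column names are hypothetical.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
    CREATE TABLE region   (cust_id INTEGER PRIMARY KEY, region    TEXT);
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO region   VALUES (1, 'EMEA'), (2, 'APAC');
""")

def two_lookups(cust_id):
    # Two round trips per input row: one per reference table.
    name = con.execute("SELECT cust_name FROM customer WHERE cust_id=?",
                       (cust_id,)).fetchone()[0]
    reg = con.execute("SELECT region FROM region WHERE cust_id=?",
                      (cust_id,)).fetchone()[0]
    return (name, reg)

def one_lookup(cust_id):
    # One round trip: both reference tables joined in a single query.
    return con.execute(
        "SELECT c.cust_name, r.region FROM customer c "
        "JOIN region r ON r.cust_id = c.cust_id WHERE c.cust_id=?",
        (cust_id,)).fetchone()

print(two_lookups(1))  # ('Acme', 'EMEA')
print(one_lookup(1))   # ('Acme', 'EMEA')
```

Same answer either way; the joined version just does the work in one trip instead of many.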

Also, spreading the lookups over more than one Transformer will give you clarity.
subramanya
Participant
Posts: 22
Joined: Fri Oct 15, 2004 11:53 pm
Location: Bangalore, India
Contact:

Post by subramanya »

Not many calculations. :) It's a staging job where I am pulling fields from different tables.

Subs
mhester
Participant
Posts: 622
Joined: Tue Mar 04, 2003 5:26 am
Location: Phoenix, AZ
Contact:

Post by mhester »

Remember that as you add additional transformers the performance of your job will decrease. In a perfect world it's better to have as few transformers as possible, but in your case I would suspect that this is not reasonable. As Craig states - there are many variables here that will impact the performance of the job with one main variable being the size of the reference hash files.

Make sure to enable cache for each reference link. The width of the incoming stream is also important, as is the number of fields within your reference hash files. Many times I see hash usage where only the customer number is needed from the customer reference lookup, yet instead of choosing only those fields required, the developer loads all of the columns when in reality only one or two are needed. Make sure the metadata for the hash is correct (field positions, key columns, etc.) so that you can choose to use only those columns required.
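As a rough illustration of why column selection matters (plain Python, hypothetical field names), compare the memory footprint of a reference lookup that keeps whole records against one that keeps only the field the job actually needs:

```python
# Illustration only, not DataStage code: keeping only the columns you
# need in an in-memory reference lookup. Field names are hypothetical.
import sys

# Pretend this is the full customer reference file: key -> whole record.
full_rows = {
    cust_id: {"cust_id": cust_id, "name": f"name{cust_id}",
              "addr": "x" * 80, "phone": "y" * 20, "notes": "z" * 200}
    for cust_id in range(1000)
}

# The job only needs 'name', so load just that column into the lookup.
trimmed = {cust_id: row["name"] for cust_id, row in full_rows.items()}

# getsizeof is shallow (it ignores the strings inside each dict),
# which only understates the full rows' true size here.
full_bytes = sum(sys.getsizeof(v) for v in full_rows.values())
trim_bytes = sum(sys.getsizeof(v) for v in trimmed.values())
print(trim_bytes < full_bytes)  # True: the trimmed lookup is far smaller
```

A smaller reference structure means more of it fits in cache and less data moves per row.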

Note: don't forget to ensure that on the table definition record for a hash file you have checked the box that says "Metadata supports Multi-valued fields". If you have a job that created the file and then imported this file definition, you will see that it is positional, which helps you choose only the fields necessary.

Regards.
amsh76
Charter Member
Posts: 118
Joined: Wed Mar 10, 2004 10:58 pm

Post by amsh76 »

It all depends on what you are trying to achieve. I would think that if you can divide the lookups over two or three Transformers and then use inter-process row buffering, it will speed up your process.
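The inter-process idea, splitting the work so the pieces run concurrently with a row buffer between them, can be sketched generically in plain Python, with threads and queues standing in for DataStage's row buffering:

```python
# Illustration only, not DataStage code: the producer/consumer idea
# behind inter-process row buffering. Each "stage" handles part of the
# work and runs concurrently, connected by a buffer (queue).
import queue
import threading

def make_stage(func, inq, outq):
    def run():
        for row in iter(inq.get, None):  # None marks end-of-data
            outq.put(func(row))
        outq.put(None)                   # propagate end-of-data
    return threading.Thread(target=run)

q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
stages = [make_stage(lambda r: r * 2, q1, q2),   # first group of lookups
          make_stage(lambda r: r + 1, q2, q3)]   # second group
for s in stages:
    s.start()
for row in [1, 2, 3]:
    q1.put(row)
q1.put(None)
result = [r for r in iter(q3.get, None)]
print(result)  # [3, 5, 7]
for s in stages:
    s.join()
```

The point is overlap: while the second stage works on row 1, the first stage is already processing row 2, instead of one monolithic stage doing everything serially.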

Also, regarding read cache: please do not enable it if you are not going to get the same input record again.

This is something you will have to play around with; based on the resources available, you will see a difference in speed.

Regards
Dsnew
Participant
Posts: 90
Joined: Tue Apr 19, 2005 2:08 pm

Post by Dsnew »

mhester

Can you please explain what "Metadata supports Multi-valued fields" is?
mhester
Participant
Posts: 622
Joined: Tue Mar 04, 2003 5:26 am
Location: Phoenix, AZ
Contact:

Post by mhester »

This feature is found on the table def record in the repository.

UniVerse files are positional: the fields are offsets from 0, with 0 being the key. When you create the file (via a DS job), a dictionary is created that contains the metadata for the file. The key is position 0 and subsequent fields are 1, 2, 3, 4, etc...

When you first import this table definition into the repository and open it for viewing, you will notice on the Columns page that no positions are listed. After you check this box, you will see that there are now Position and Type columns. The key will display as 0 and the remaining fields as 1, 2, 3, etc...

When loading the metadata for a reference lookup (using a hash) you can choose only those columns that are important and you will notice that the position (once loaded) will be correct.

I also believe that pre-loading the file to memory, regardless of whether the incoming row will be seen again, is valuable as it avoids disk reads.

Hope this helps,
Dsnew
Participant
Posts: 90
Joined: Tue Apr 19, 2005 2:08 pm

Post by Dsnew »

I imported from a table, and the Columns tab did show column numbers 1, 2, 3...
I checked the option "Metadata supports Multi-valued fields" but it did not make any difference to the sequence; it was still 1, 2, 3...
Am I doing the right thing, or did you mean something else?
:?
Dsnew
Participant
Posts: 90
Joined: Tue Apr 19, 2005 2:08 pm

Post by Dsnew »

Can you also tell me the benefit of checking this option, please?
mhester
Participant
Posts: 622
Joined: Tue Mar 04, 2003 5:26 am
Location: Phoenix, AZ
Contact:

Post by mhester »

You need to make sure you are importing the metadata via the Manager using Import > Table Definitions > UniVerse Files.

I believe you are referring to the column numbers on the left side. After you check "Metadata supports Multi-valued fields" you should see two new columns on the Columns page: Position and Type. These are specific to UniVerse file types - this is what I am referring to.
Dsnew
Participant
Posts: 90
Joined: Tue Apr 19, 2005 2:08 pm

Post by Dsnew »

Oh, I got it wrong; I was looking into this while exporting data from Oracle.

Is this option "Metadata supports Multi-valued fields" specific to UniVerse files only?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

UniVerse and UniData only.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.