using fileset lookup stage or not?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Aggie99
Participant
Posts: 54
Joined: Thu Sep 04, 2008 6:54 pm


Post by Aggie99 »

I have been using datasets with the lookup stage, but then I read about the fileset lookup stage.


What is the difference between using

one master dataset, one reference dataset, and one lookup stage to join on a key

versus

one master dataset, one fileset lookup stage, and one lookup stage to join on a key?

Is it mainly a matter of data size, i.e. whether the data is over 2GB or not?


Thanks again.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

According to the documentation, the fileset will perform better, since the data is already stored in the form the lookup needs. I've not noticed significant performance differences myself, and lookup filesets cannot be 'viewed', which is a distinct disadvantage.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I would think the 'performance gain' comes when you use the Lookup Fileset in multiple jobs or steps in a job. Take the time to build it once and then leverage it many times.
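
To make the "build once, leverage many times" idea concrete, here is a rough conceptual sketch in plain Python. It is not DataStage code and says nothing about how a Lookup Fileset is stored internally; the names `build_lookup_index` and `lookup_join` and the sample rows are made up for illustration. The only point is that the cost of indexing the reference data is paid once, and every later lookup just probes the prebuilt index.

```python
# Conceptual sketch only: plain Python, not DataStage.
# All names and sample data here are hypothetical.

def build_lookup_index(reference_rows, key):
    """Pay the build cost once: index the reference rows by key."""
    return {row[key]: row for row in reference_rows}


def lookup_join(master_rows, index, key):
    """Probe a prebuilt index; master rows with no match are dropped."""
    joined = []
    for row in master_rows:
        match = index.get(row[key])
        if match is not None:
            # Reference columns first, then master columns on top.
            joined.append({**match, **row})
    return joined


reference = [
    {"cust_id": 1, "name": "Ann"},
    {"cust_id": 2, "name": "Bob"},
]

# Built once...
index = build_lookup_index(reference, "cust_id")

# ...then reused by several independent "jobs".
orders = [{"cust_id": 1, "amount": 10}, {"cust_id": 3, "amount": 5}]
returns = [{"cust_id": 2, "amount": -4}]

joined_orders = lookup_join(orders, index, "cust_id")
joined_returns = lookup_join(returns, index, "cust_id")
```

If the index were rebuilt inside every call instead, each "job" would pay the build cost again, which is roughly the situation when every job does its own lookup against a plain reference dataset.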
-craig

"You can never have too many knives" -- Logan Nine Fingers