using fileset lookup stage or not?

Posted: Fri Feb 26, 2010 6:59 am
by Aggie99
I have been using datasets with the lookup stage. But then I read about the fileset lookup stage.


What is the difference between using

one master dataset, and one reference dataset and a lookup stage to join on a key

versus

one master dataset, one fileset lookup stage, and one lookup stage to join on a key.

Is the choice mainly about the size of the data, i.e. whether it is over 2GB or not?


thanks again.

Posted: Fri Feb 26, 2010 7:02 am
by ArndW
According to the documentation the fileset will have better performance, since the data is already stored in the form the lookup needs. I've not noticed significant performance differences, but lookup filesets cannot be 'viewed', which is a distinct disadvantage.

Posted: Fri Feb 26, 2010 8:34 am
by chulett
I would think the 'performance gain' comes when you use the Lookup Fileset in multiple jobs or steps in a job. Take the time to build it once and then leverage it many times.
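
To make the "build once, leverage many times" idea concrete, here is a rough Python analogy. It is not DataStage code and does not reflect how a Lookup File Set is actually stored; the function names (build_lookup, save_lookup, load_lookup, join_on_key), the pickle file, and the sample rows are all made up for illustration. The point is only that the keyed table is built and written once, and each later "job" just loads it and probes it.

```python
import pickle
import tempfile
from pathlib import Path


def build_lookup(rows, key):
    """Index the reference rows by key once (hypothetical helper)."""
    return {row[key]: row for row in rows}


def save_lookup(table, path):
    """Persist the already-keyed table so later jobs skip the build step."""
    with open(path, "wb") as fh:
        pickle.dump(table, fh)


def load_lookup(path):
    """Load the prebuilt table; no re-indexing is needed."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def join_on_key(master_rows, table, key):
    """Enrich master rows with reference columns; unmatched rows are dropped."""
    joined = []
    for row in master_rows:
        match = table.get(row[key])
        if match is not None:
            joined.append({**match, **row})
    return joined


# Build step: run once.
reference = [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]
path = Path(tempfile.gettempdir()) / "reference_lookup.pkl"
save_lookup(build_lookup(reference, "id"), path)

# Two separate "jobs" reuse the same prebuilt table.
table = load_lookup(path)
job_a = join_on_key([{"id": 1, "qty": 5}], table, "id")
job_b = join_on_key([{"id": 2, "qty": 7}, {"id": 3, "qty": 9}], table, "id")
```

If the reference data is only ever used by one job, the extra build-and-save step buys little in this sketch, which matches the point above about reuse being where the gain would come from.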