File Set

Posted: Wed Jun 18, 2014 7:21 am
by vamsi.4a6
I have not used the File Set stage so far. Can anybody give me one practical example where I would use a File Set stage for something that is not possible with a Data Set or a Sequential File? I have read many posts on this site, as well as the theory on File Sets.

I am clear on the differences between all of them, but I am not able to find any real-life example.

Posted: Wed Jun 18, 2014 5:04 pm
by ray.wurlod
Have you read the pertinent chapters in the Parallel Job Developer's Guide?

Data Sets and File Sets are both parallel mechanisms for data storage. They each have file(s) for each node when being written to.

Data Sets store data in DataStage internal format; primarily this means that numeric data are stored in binary formats. These are very efficient for staging data between jobs; the operator involved is copy.
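To make the binary-versus-text distinction concrete, here is a minimal Python sketch. It is only an analogy: the 4-byte big-endian layout is an assumption chosen for illustration and is not the actual DataStage internal format.

```python
import struct

value = 1234567890

# Internal-style storage: fixed-width binary.
# Assumption for illustration: 4-byte big-endian signed integer.
binary_form = struct.pack(">i", value)

# External-style storage: human-readable digits, as in a text file.
text_form = str(value).encode("ascii")

# Reading the binary form back is a direct unpack; no text parsing is needed.
(decoded_binary,) = struct.unpack(">i", binary_form)

# Reading the text form back requires a parse, comparable to an import step.
decoded_text = int(text_form.decode("ascii"))

print(len(binary_form), len(text_form))
```

The binary form is fixed-width and needs no conversion on read, but only a program that knows the layout can interpret it. The text form can be opened by any tool, at the cost of a conversion in each direction.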

File Sets store data in external (human-readable) format, and therefore have to undergo export when being written to and import when being read from by DataStage. The descriptor file contains the record schema. Unlike Data Sets, the data in File Sets can be readily read by other applications.
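The idea of "one readable data file per node plus a descriptor that carries the schema" can be sketched in Python as follows. This is a toy model only: the JSON descriptor, the file naming, and the helper names `write_file_set` and `read_file_set` are all hypothetical and do not reflect the real File Set descriptor layout.

```python
import csv
import json
import os
import tempfile


def write_file_set(directory, name, schema, partitions):
    """Write one CSV file per partition plus a descriptor (toy format)."""
    data_files = []
    for node, rows in enumerate(partitions):
        path = os.path.join(directory, f"{name}.part{node}.csv")
        with open(path, "w", newline="") as handle:
            csv.writer(handle).writerows(rows)  # export: values become text
        data_files.append(path)
    descriptor_path = os.path.join(directory, f"{name}.descriptor.json")
    with open(descriptor_path, "w") as handle:
        json.dump({"schema": schema, "files": data_files}, handle)
    return descriptor_path


def read_file_set(descriptor_path):
    """Read every data file named in the descriptor, applying its schema."""
    with open(descriptor_path) as handle:
        descriptor = json.load(handle)
    converters = {"int": int, "str": str}
    rows = []
    for path in descriptor["files"]:
        with open(path, newline="") as handle:
            for record in csv.reader(handle):
                # import: text fields are converted back to typed values
                rows.append({
                    column["name"]: converters[column["type"]](field)
                    for column, field in zip(descriptor["schema"], record)
                })
    return rows


schema = [{"name": "id", "type": "int"}, {"name": "city", "type": "str"}]
partitions = [[(1, "Pune"), (2, "Delhi")], [(3, "Chennai")]]

with tempfile.TemporaryDirectory() as workdir:
    descriptor = write_file_set(workdir, "customers", schema, partitions)
    rows = read_file_set(descriptor)

print(rows)
```

Because each part file is ordinary text, another application could read it directly without the descriptor, which is the property that distinguishes a File Set from a Data Set.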

You might also become aware of something called a Lookup File Set, which is unlike a File Set in that data are stored in blocks and referenced via a hash index on the designated key value. A Lookup File Set is intended to be used with a Lookup stage in DataStage so as to avoid the overhead of the LUT_CreateOp operator (to build the in-memory reference data and the hash index) at run time.
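As a rough analogy for why a prebuilt index helps, the following Python sketch builds a key-to-offset index once, persists it, and then serves lookups by seeking straight to the record. The on-disk layout, the JSON encoding, and the names `build_lookup_file` and `lookup` are assumptions made for illustration; they are not how a Lookup File Set is actually stored.

```python
import json
import os
import tempfile


def build_lookup_file(data_path, index_path, records, key):
    """Write the records once and persist a key -> byte offset index."""
    index = {}
    with open(data_path, "wb") as handle:
        for record in records:
            index[str(record[key])] = handle.tell()
            handle.write(json.dumps(record).encode("utf-8") + b"\n")
    with open(index_path, "w") as handle:
        json.dump(index, handle)


def lookup(data_path, index_path, key_value):
    """Find a record via the persisted index instead of scanning the data."""
    with open(index_path) as handle:
        index = json.load(handle)
    offset = index.get(str(key_value))
    if offset is None:
        return None
    with open(data_path, "rb") as handle:
        handle.seek(offset)
        return json.loads(handle.readline())


records = [
    {"code": "IN", "country": "India"},
    {"code": "AU", "country": "Australia"},
]

with tempfile.TemporaryDirectory() as workdir:
    data_path = os.path.join(workdir, "countries.dat")
    index_path = os.path.join(workdir, "countries.idx")
    build_lookup_file(data_path, index_path, records, key="code")
    found = lookup(data_path, index_path, "AU")
    missing = lookup(data_path, index_path, "ZZ")

print(found, missing)
```

The cost of building the index is paid once by the job that writes the file, so every job that later reads it can skip that step.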