File Set

vamsi.4a6
Participant
Posts: 334
Joined: Sun Jan 22, 2012 7:06 am

Post by vamsi.4a6 »

I have not used the File Set stage so far. Can anybody give me one practical example where I can use a File Set stage to do something that is not possible with a Data Set or a Sequential File? I have read many posts on this site, as well as the theory on File Sets.

I am clear on the differences between all of them, but I am not able to find any real-life example.
Thanks and Regards
Vamsi krishna.v
http://datastage-vamsi.blogspot.in/
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Have you read the pertinent chapters in the Parallel Job Developer's Guide?

Data Sets and File Sets are both parallel mechanisms for data storage. Each has one or more data files per node when it is written to.

Data Sets store data in DataStage internal format; primarily this means that numeric data are stored in binary formats. This makes them very efficient for staging data between jobs; the operator involved is copy.

File Sets store data in external (human-readable) format, and therefore have to undergo export when being written to and import when being read from by DataStage. The descriptor file contains the record schema. Unlike Data Sets, the data in File Sets can be readily read by other applications.
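To make the internal-versus-external distinction concrete, here is a minimal Python sketch of the idea. It is only an analogy: the record layout, the function names and the comma-delimited text format are assumptions made for illustration, and this is not the actual on-disk format of a Data Set or a File Set.

```python
import struct

# Hypothetical records: (customer_id, balance). Not a real DataStage schema.
records = [(1, 10.5), (2, 20.25), (300, -7.0)]

# Assumed fixed binary layout: little-endian 4-byte int + 8-byte float.
ROW_FORMAT = "<id"
ROW_SIZE = struct.calcsize(ROW_FORMAT)


def to_binary(rows):
    """Internal-style storage: numbers stay binary, no text conversion."""
    return b"".join(struct.pack(ROW_FORMAT, cid, bal) for cid, bal in rows)


def from_binary(blob):
    """Read the binary rows back by slicing fixed-size chunks."""
    return [
        struct.unpack(ROW_FORMAT, blob[i:i + ROW_SIZE])
        for i in range(0, len(blob), ROW_SIZE)
    ]


def export_text(rows):
    """External-style storage: every value is converted to readable text."""
    return "".join(f"{cid},{bal}\n" for cid, bal in rows)


def import_text(text):
    """Reading text back means parsing every field again."""
    rows = []
    for line in text.splitlines():
        cid, bal = line.split(",")
        rows.append((int(cid), float(bal)))
    return rows


binary_blob = to_binary(records)
text_blob = export_text(records)
```

Both forms round-trip the same rows, but only `text_blob` can be opened in an editor or read by another application without knowing the binary layout. The price is the conversion work in `export_text` and `import_text`, which is the sketch's stand-in for export and import.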

You might also become aware of something called a Lookup File Set, which is unlike a File Set in that data are stored in blocks and referenced via a hash index on the designated key value. A Lookup File Set is intended to be used with a Lookup stage in DataStage so as to avoid the overhead of the LUT_CreateOp operator (to build the in-memory reference data and the hash index) at run time.
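The benefit of a prebuilt, key-indexed reference can be sketched in Python as follows. This is a conceptual analogy only: the function names, the sample rows and the use of pickle are assumptions for illustration and say nothing about how a Lookup File Set is actually laid out on disk.

```python
import pickle

# Hypothetical reference data keyed on "code".
reference_rows = [
    {"code": "AU", "name": "Australia"},
    {"code": "IN", "name": "India"},
]


def build_lookup(rows, key):
    """Build the key -> row hash index once and serialize it for reuse."""
    index = {row[key]: row for row in rows}
    return pickle.dumps(index)


def load_lookup(blob):
    """Load the ready-made index; nothing is rebuilt at this point."""
    return pickle.loads(blob)


# Done once, in the job that creates the reference data.
stored_index = build_lookup(reference_rows, "code")

# Done in each job that performs the lookup.
table = load_lookup(stored_index)
stream = [{"id": 1, "code": "IN"}, {"id": 2, "code": "XX"}]
enriched = [
    {**row, "name": table.get(row["code"], {}).get("name")}
    for row in stream
]
```

Here the job that does the lookup only loads an index that already exists, instead of reading the raw reference rows and hashing them again on every run. Unmatched keys such as "XX" simply get `None` for the looked-up column in this sketch.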
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.