Fileset vs Dataset

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

dstest
Participant
Posts: 66
Joined: Sun Aug 19, 2007 10:52 pm

Fileset vs Dataset

Post by dstest »

I would like to know which stage is faster, File Set or Data Set. Both operate in parallel mode, and the problem with a Data Set is that it uses more space than a sequential file holding the same number of records.

Thanks
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Data Set is faster because no translation (export/import) is required. Both a File Set and a Data Set will use more space than a sequential file because, among other things, they include information related to the partitioning and sorted order of the rows they contain. Storage of unbounded VarChar data types is also done differently.
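As a rough illustration of the trade-off only (this is not DataStage's actual on-disk format; the rows, the 32-byte string width, and all function names are hypothetical), here is a Python sketch comparing a delimited text layout with a fixed-width binary layout. The binary form reads back without parsing characters into numbers, but the padding makes it larger:

```python
import struct

# Hypothetical sample rows (id, name); not taken from any real job.
rows = [(1, "Ann"), (2, "Bartholomew"), (3, "Cy")]

# Assumed fixed-width record: 4-byte little-endian int + 32-byte padded string.
RECORD = struct.Struct("<i32s")


def to_text(rows):
    """Delimited text: every field is converted to characters on write."""
    return "".join(f"{i},{name}\n" for i, name in rows).encode("ascii")


def from_text(data):
    """Reading text back needs parsing and type conversion (the 'import' step)."""
    out = []
    for line in data.decode("ascii").splitlines():
        i, name = line.split(",", 1)
        out.append((int(i), name))
    return out


def to_binary(rows):
    """Fixed-width binary: short strings are padded with null bytes."""
    return b"".join(RECORD.pack(i, name.encode("ascii")) for i, name in rows)


def from_binary(data):
    """Reading binary back is a direct unpack; only the padding is stripped."""
    return [
        (i, raw.rstrip(b"\x00").decode("ascii"))
        for i, raw in RECORD.iter_unpack(data)
    ]


text = to_text(rows)
binary = to_binary(rows)
print("text bytes:", len(text), "binary bytes:", len(binary))
```

Both forms round-trip to the same rows; the difference is that the text form pays a conversion cost on every read and write, while the padded binary form pays in disk space.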
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.