
diff between Data set, sequential file stage & file stage

Posted: Mon Aug 08, 2005 12:05 am
by balaid
Hi,
I am new to DataStage Parallel Extender. I would like to know the
difference between the Data Set stage, the Sequential File stage, and the File stage, and how each is used in various scenarios.

Can anyone help me?

Thanks
sundar

Posted: Mon Aug 08, 2005 12:27 am
by ray.wurlod
Welcome aboard! :D

Reading the appropriate chapters describing each stage type in the Parallel Job Developer's Guide (parjdev.pdf) will aid your understanding.

A quick, and necessarily incomplete, summary is:
  • A Sequential File stage accesses regular operating system files, such as CSV files, text files, and so on. In general that access must be sequential rather than parallel.

  • A Data Set stage accesses a persistent Data Set, which is an on-disk copy of a virtual Data Set, that is, the set of partitioned data with which all operators in parallel jobs deal. Data in Data Sets are in internal format; in particular, numeric data are stored in binary form.

  • A File Set stage accesses a File Set, which is partitioned across all nodes specified in the configuration file (as is a Data Set) but which contains human-readable data in each of its files. A Lookup File Set includes a key definition in its schema.
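
To make the three storage shapes in the summary above concrete, here is a rough Python analogy. It is only a sketch and not DataStage code: the round-robin partitioning, the two-partition count, and the toy binary record layout are all assumptions made up for illustration, and the real on-disk formats differ.

```python
import csv
import io
import struct

# Hypothetical sample rows: (integer key, name).
rows = [(1, "alpha"), (2, "beta"), (3, "gamma"), (4, "delta")]
NUM_PARTITIONS = 2  # assumption: stands in for the nodes in a configuration file


def partition(rows, n):
    """Round-robin split of rows into n partitions (illustrative choice)."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def to_text(rows):
    """Human-readable delimited text, like an ordinary CSV file."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def to_binary(rows):
    """Toy binary layout (NOT the real Data Set format):
    4-byte little-endian int, 1-byte name length, then the name bytes."""
    out = bytearray()
    for key, name in rows:
        encoded = name.encode("utf-8")
        out += struct.pack("<iB", key, len(encoded)) + encoded
    return bytes(out)


# Sequential file analogue: one text file, read from start to end.
sequential_file = to_text(rows)

# File set analogue: one human-readable text file per partition.
file_set = [to_text(p) for p in partition(rows, NUM_PARTITIONS)]

# Data set analogue: one binary (internal-format) file per partition.
data_set = [to_binary(p) for p in partition(rows, NUM_PARTITIONS)]
```

In this analogy the file set and the data set hold the same rows in the same partitions; the only difference is that the file set pieces can be read in a text editor while the data set pieces cannot.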

Posted: Mon Aug 08, 2005 4:20 am
by kumar_s
I think this is the nth time Ray has explained this same topic to newcomers.

Hi sundar,
If you search this forum for the same topic you have posted, you may get more info. :lol:

regards
kumar