How does a Data Set differ from the Sequential File stage?

suri
Participant
Posts: 24
Joined: Tue May 25, 2004 12:17 am
Location: piscataway

Post by suri »

Hi All,

Can anyone explain the difference between the Sequential File stage and the Data Set stage in Parallel Extender?

Thanks in advance,
Suri
elavenil
Premium Member
Posts: 467
Joined: Thu Jan 31, 2002 10:20 pm
Location: Singapore

Post by elavenil »

When you use the Sequential File stage in PX, the parallelism is lost. There are many differences, but this is one of the main ones. Please read the provided documentation to understand the differences between the two stages.

Regards
Saravanan
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

A (persistent) Data Set has rows on every processing node. It can therefore be processed in parallel. Data in a Data Set are in internal format (for example, an int32 occupies four bytes).

A File Set is like a Data Set, except that the data are stored in external (human-readable) format, so require conversion when being brought into or out of the PX environment.

A sequential file is a single operating system file; it can only be accessed on one node. In general it can only be accessed sequentially by a single process (there is one exception, which requires fixed-length structure). Any sequential file must also be converted when being brought into or out of the PX environment.
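
The internal versus external format distinction above can be illustrated with a small Python sketch. This is not DataStage code: the helper names are hypothetical, and the little-endian byte order is an arbitrary assumption made only for the illustration. It shows a 32-bit integer occupying four bytes in a binary layout, while the human-readable form varies in length and has to be parsed on the way in.

```python
import struct

# Hypothetical helpers for illustration only; not part of DataStage.

def to_internal(value: int) -> bytes:
    """Pack an integer as a fixed-width 4-byte signed int (byte order assumed)."""
    return struct.pack("<i", value)

def from_internal(raw: bytes) -> int:
    """Unpack a 4-byte signed int; no text parsing is involved."""
    return struct.unpack("<i", raw)[0]

def to_external(value: int) -> bytes:
    """Render an integer as human-readable ASCII digits."""
    return str(value).encode("ascii")

def from_external(raw: bytes) -> int:
    """Parse ASCII digits back into an integer (the conversion step)."""
    return int(raw.decode("ascii"))

for v in (7, 1234567, -42):
    # Internal width is always 4 bytes; external width depends on the digits.
    print(v, len(to_internal(v)), len(to_external(v)))
```

The fixed width is what lets an engine read such data without a conversion step; the text form trades that for readability.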
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bgs
Participant
Posts: 22
Joined: Sat Feb 05, 2005 9:43 pm

Post by bgs »

A Data Set preserves partitioning. It stores data on the nodes, so when you read from a Data Set you don't have to repartition your data.
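
As a rough illustration of that point, here is a minimal Python sketch. It is not DataStage code: the modulo partitioner, the two-node setup, and all names are assumptions made for the example, and the real partitioners and on-disk layout differ. Rows written already partitioned can be read back per node as they are, whereas rows read from a single sequential stream have to be partitioned again.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # (key, payload); a hypothetical row shape

def partition_by_key(rows: List[Row], nodes: int) -> Dict[int, List[Row]]:
    """Assign each row to a node by key modulo node count (assumed partitioner)."""
    parts: Dict[int, List[Row]] = {n: [] for n in range(nodes)}
    for row in rows:
        parts[row[0] % nodes].append(row)
    return parts

NODES = 2
rows = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]

# "Data Set": stored already split, one segment per node.
data_set = partition_by_key(rows, NODES)

# "Sequential file": one stream of rows with no partition information.
sequential_file = list(rows)

# Reading the Data Set: each node takes its own segment, no repartitioning.
read_from_data_set = {n: list(seg) for n, seg in data_set.items()}

# Reading the sequential file: a single reader, then an explicit repartition.
read_from_sequential = partition_by_key(sequential_file, NODES)

print(read_from_data_set)
```

Both reads end up with the same rows on the same nodes; the difference is that the sequential path pays for the extra partitioning step on every read.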
hondaccord94
Participant
Posts: 46
Joined: Tue Aug 10, 2004 11:07 am
Location: Mclean VA

Post by hondaccord94 »

babu suresh

Data Set stage: cryptic, broken, understandable to DataStage alone.

Sequential File stage: ASCII text, readable by the human eye.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

We trust that by "broken" you mean split over the available processing nodes, not "damaged"! :lol: