Sequential File

PhaniKonagalla
Participant
Posts: 7
Joined: Tue Jul 28, 2009 6:09 am
Location: Chennai

Sequential File

Post by PhaniKonagalla »

Hi,

Can we use UNIX commands in the Sequential File stage, other than in FileName?
Can the Sequential file run in parallel?
Phani Kumar
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Yes and yes.

You can use any well-formed UNIX command as a Filter. The Sequential File stage will consume stdout from the filter command.
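
To illustrate the idea outside DataStage, here is a minimal Python sketch of a reader that passes a file through a UNIX command and consumes the command's stdout. It assumes the file is piped to the filter's stdin; `read_through_filter` and the `tr` command are hypothetical names chosen for the example, not anything the stage exposes.

```python
import subprocess


def read_through_filter(path, filter_cmd):
    """Pipe the file at `path` into `filter_cmd` and return its stdout as lines.

    Assumption: the filter reads the file on stdin and writes rows to stdout.
    """
    with open(path, "rb") as src:
        result = subprocess.run(
            filter_cmd,
            shell=True,              # the filter is given as a shell command line
            stdin=src,               # file contents feed the filter
            stdout=subprocess.PIPE,  # the reader consumes whatever the filter prints
            check=True,              # raise if the command exits non-zero
        )
    return result.stdout.decode().splitlines()


if __name__ == "__main__":
    import sys

    # Usage: python filter_demo.py input.txt "tr a-z A-Z"
    for line in read_through_filter(sys.argv[1], sys.argv[2]):
        print(line)
```

Note that `check=True` treats any non-zero exit status as a failure, so a command such as `grep` that matches nothing would raise here.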

The Sequential File stage can run in parallel if you specify multiple readers per node or specify that it is to read more than one file.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

You could also read a single file in parallel by having it read from multiple nodes (the Read From Multiple Nodes option), provided it's a fixed-width file.
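
A rough sketch of why fixed width matters: when every record has the same length, each reader's start and end offsets can be computed by arithmetic, without scanning for delimiters. The function name `reader_byte_ranges` is hypothetical, and `record_len` is assumed to include any record delimiter; this is not how the stage is implemented internally, only the underlying idea.

```python
def reader_byte_ranges(file_size, record_len, readers):
    """Split a fixed-width file into contiguous, record-aligned byte ranges.

    Returns one (start, end) pair per reader, with `end` exclusive.
    """
    if record_len <= 0 or readers <= 0:
        raise ValueError("record_len and readers must be positive")
    if file_size % record_len != 0:
        raise ValueError("file size is not a multiple of the record length")

    total_records = file_size // record_len
    base, extra = divmod(total_records, readers)

    ranges = []
    start = 0
    for i in range(readers):
        # The first `extra` readers take one additional record each.
        count = base + (1 if i < extra else 0)
        end = start + count * record_len
        ranges.append((start, end))
        start = end
    return ranges


if __name__ == "__main__":
    # 10 records of 10 bytes each, shared between 3 readers.
    for start, end in reader_byte_ranges(100, 10, 3):
        print(start, end)
```

With delimited, variable-length records the same split is not possible up front, because a computed offset may land in the middle of a record.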
- Zulfi
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

ray.wurlod wrote: The Sequential File stage can run in parallel if you specify multiple readers per node or specify that it is to read more than one file.

When a file pattern is used, my assumption was that the files are concatenated and the result is read sequentially. Does the quote above mean that the files are read in parallel? If so, is it one reader per file?
- Zulfi
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Your assumption is correct as far as the default operation of the stage is concerned when using a file pattern: It will concatenate the files prior to reading them.

The files which match a pattern can be read in parallel if the environment variable documented at http://publib.boulder.ibm.com/infocente ... FILESET is set. The underlying operator will then read the files as if they were members of a fileset. IIRC it will be one reader per file, up to the degree of parallelism in which your job is running.
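
The difference between the two behaviours can be sketched in Python. This is only an analogy, not the operator's actual implementation: `read_concatenated`, `read_per_file` and the `degree` parameter are hypothetical names, and a thread pool stands in for the job's degree of parallelism.

```python
import glob
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    """Read one file and return its lines."""
    with open(path, "r") as f:
        return f.read().splitlines()


def read_concatenated(pattern):
    """Default-style behaviour: treat all matching files as one sequential stream."""
    rows = []
    for path in sorted(glob.glob(pattern)):
        rows.extend(read_file(path))
    return rows


def read_per_file(pattern, degree):
    """Fileset-style behaviour: one reader per file, at most `degree` at a time."""
    paths = sorted(glob.glob(pattern))
    with ThreadPoolExecutor(max_workers=max(1, degree)) as pool:
        # pool.map preserves input order, so results line up with paths.
        return dict(zip(paths, pool.map(read_file, paths)))


if __name__ == "__main__":
    import sys

    # Usage: python pattern_demo.py "/data/in/part_*.txt" 4
    per_file = read_per_file(sys.argv[1], int(sys.argv[2]))
    for path, rows in per_file.items():
        print(path, len(rows))
```

If more files match than `degree` allows, the remaining files simply wait for a free reader, which mirrors the "up to the degree of parallelism" limit described above.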

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.