External Source Stage

nvalia
Premium Member
Posts: 180
Joined: Thu May 26, 2005 6:44 am

Post by nvalia »

Has anyone worked on the External Source stage?
I need to read a file from a URL path and pick the last record only. Any solution for this?

Regards,
Nirav
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The External Source stage executes a command (possibly a shell script). That command's stdout becomes the output of the External Source stage. Assuming you have a script that can "read a file from a URL path" (whatever that involves), you can pipe its result through tail -1 to get the last line. If it's a regular pathname, you could apply tail -1 directly to that pathname.
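
As a rough sketch of the kind of script this describes (not a tested DataStage configuration): the function names last_record and fetch_last_record are hypothetical, and the use of curl to fetch the URL is an assumption, since the original question does not say how the file is served. Whatever the script writes to stdout is what the stage would receive.

```shell
#!/bin/sh
# Hypothetical helper: print only the last line of whatever arrives on stdin.
last_record() {
    tail -n 1
}

# Hypothetical helper: fetch a URL and emit only its last line.
# Assumes curl is installed; -s hides the progress meter, -f fails on HTTP errors.
fetch_last_record() {
    curl -sf "$1" | last_record
}

# Local demonstration that needs no network access:
printf 'id,name\n1,alpha\n2,beta\n' | last_record
```

The demonstration line prints 2,beta. In a real job the script would call fetch_last_record with the actual URL instead, and the stage's record schema would need to describe that single line.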

Make sure that the record schema on the External Source stage matches exactly what is produced by the command that you invoke.

Beware, too, that without intervention this stage will run on every processing node, so the command is executed once per node. Set the stage's execution mode to sequential, and/or constrain it to a node pool containing only one node, unless, of course, you want that last line on every partition.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
splayer
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Post by splayer »

ray, a question about this statement of yours, "this stage will operate on every processing node". Can you tell me how you came to know this? It is not documented anywhere, at least not in the PDFs. Thanks.